History log of /freebsd-current/sys/kern/kern_shutdown.c
Revision Date Author Comments
# 2cb49090 30-Apr-2024 Justin Hibbits <jhibbits@FreeBSD.org>

cons: Add boot option to mute boot messages after banner

This is useful for embedded systems, where it provides feedback that the
kernel has booted, but avoids printing the probe messages. If both
mutemsgs and verbose are set, verbose cancels the mute.

Additionally, this unmutes the console on panic, so a user can see what
happened leading up to the panic.

Obtained from: Juniper Networks, Inc.
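
The precedence described in this entry can be sketched in userland C. This is only a model under stated assumptions: `struct boot_flags` and `console_is_muted()` are hypothetical names, and the kernel's real variables and control flow may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the boot-time state; not the kernel's variables. */
struct boot_flags {
	bool mutemsgs;	/* "mute after banner" option was requested */
	bool verbose;	/* verbose boot was requested */
	bool panicked;	/* the kernel has panicked */
};

/* Returns true when post-banner console output would be suppressed. */
bool
console_is_muted(const struct boot_flags *bf)
{
	assert(bf != NULL);
	if (bf->panicked)
		return (false);		/* a panic always unmutes */
	if (bf->verbose)
		return (false);		/* verbose cancels the mute */
	return (bf->mutemsgs);
}
```

The checks are ordered so that the two overrides named in the message (panic, verbose) are tested before the mute request itself.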


# 2aee804c 27-Mar-2024 Stephen J. Kiernan <stevek@FreeBSD.org>

kerneldump: Add flag to indicate kernel core was successfully dumped

This allows for shutdown_final EVENTHANDLERs to know that a core dump
successfully occurred. Embedded systems may want to record this fact
or act on it.

Obtained from: Juniper Networks, Inc.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44542


# e4ab361e 06-Feb-2024 Andriy Gapon <avg@FreeBSD.org>

fix poweroff regression from 9cdf326b4f by delaying shutdown_halt

The regression affected ACPI-based systems without EFI poweroff support
(including VMs).

The key reason for the regression is that I overlooked that poweroff is
requested by RB_POWEROFF | RB_HALT combination of flags. In my opinion,
that command is a bit bipolar, but since we've been doing that forever,
then so be it. Because of that flag combination, the order of
shutdown_final handlers that check for either flag does matter.

Some additional complexity comes from platform-specific shutdown_final
handlers that aim to handle multiple reboot options at once. E.g.,
acpi_shutdown_final handles both poweroff and reboot / reset. As
explained in 9cdf326b4f, such a handler must run after shutdown_panic to
give it a chance. But as the change revealed, the handler must also run
before shutdown_halt, so that the system can actually power off before
entering the halt limbo.

Previously, shutdown_panic and shutdown_halt had the same priority which
appears to be incompatible with handlers that can do both poweroff and
reset.

The above also applies to power cycle handlers.

PR: 276784
Reported by: many
Tested by: Katsuyuki Miyoshi <katsubsd@gmail.com>,
Masachika ISHIZUKA <ish@amail.plala.or.jp>
Fixes: 9cdf326b4fae run acpi_shutdown_final later to give other handlers a chance
MFC after: 1 week
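
Why the ordering matters can be illustrated with a small userland model. Everything here is an assumption made for illustration: the flag values, the `model_*` handlers, and the numeric priorities are hypothetical and do not match the kernel's EVENTHANDLER definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative flag values; the real ones live in <sys/reboot.h>. */
#define	MODEL_RB_HALT		0x1
#define	MODEL_RB_POWEROFF	0x2

enum outcome { STILL_RUNNING, POWERED_OFF, HALTED };

struct handler {
	int priority;				/* lower runs first */
	enum outcome (*fn)(int howto);
};

/* Platform handler: powers off when the poweroff flag is requested. */
enum outcome
model_acpi_final(int howto)
{
	return ((howto & MODEL_RB_POWEROFF) ? POWERED_OFF : STILL_RUNNING);
}

/* Generic halt handler: parks the machine when the halt flag is set. */
enum outcome
model_halt(int howto)
{
	return ((howto & MODEL_RB_HALT) ? HALTED : STILL_RUNNING);
}

static int
by_priority(const void *a, const void *b)
{
	const struct handler *ha = a, *hb = b;

	return ((ha->priority > hb->priority) - (ha->priority < hb->priority));
}

/* Run handlers in priority order; the first one that stops the machine wins. */
enum outcome
run_shutdown_final(struct handler *h, size_t n, int howto)
{
	assert(h != NULL || n == 0);
	qsort(h, n, sizeof(*h), by_priority);
	for (size_t i = 0; i < n; i++) {
		enum outcome o = h[i].fn(howto);

		if (o != STILL_RUNNING)
			return (o);
	}
	return (STILL_RUNNING);
}
```

With both flags set, a halt handler that runs before the platform handler parks the machine before it can power off; running the platform handler first lets the poweroff happen.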


# 6b353101 18-Jan-2024 Olivier Certner <olce@FreeBSD.org>

SCHEDULER_STOPPED(): Rely on a global variable

A commit from 2012 (5d7380f8e34f0083, r228424) introduced
'td_stopsched', on the ground that a global variable would cause all
CPUs to have a copy of it in their cache, and consequently of all other
variables sharing the same cache line.

This is really a problem only if that cache line sees relatively
frequent modifications. This was unlikely to be the case back then
because nearby variables are almost never modified as well. In any
case, today we have a new tool at our disposal to ensure that this
variable goes into a read-mostly section containing frequently-accessed
variables ('__read_frequently'). Most of the cache lines covering this
section are likely to always be in every CPU cache. This makes the
second reason stated in the commit message (ensuring the field is in the
same cache line as some lock-related fields, since these are accessed in
close proximity) moot, as well as the second order effect of requiring
an additional line to be present in the cache (the one containing the
new 'scheduler_stopped' boolean, see below).

From a pure logical point of view, whether the scheduler is stopped is
a global state and is certainly not a per-thread quality.

Consequently, remove 'td_stopsched', which immediately frees a byte in
'struct thread'. Currently, the latter's size (and layout) stays
unchanged, but some of the later re-orderings will probably benefit from
this removal. Available bytes at the original position for
'td_stopsched' have been made explicit with the addition of the
'_td_pad0' member.

Store the global state in the new 'scheduler_stopped' boolean, which is
annotated with '__read_frequently'.

Replace uses of SCHEDULER_STOPPED_TD() with SCHEDULER_STOPPED() and
remove the former as it is now unnecessary.

Reviewed by: markj, kib
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43572
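
A minimal userland sketch of the resulting shape, assuming the macro is a plain read of the global (the kernel version may add a branch hint). `lock_op_model()` is a hypothetical consumer, and `__read_frequently` is defined away here because it is a kernel-only annotation.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * In the kernel, __read_frequently places a variable in a dedicated
 * section. This userland sketch defines it away so the code compiles.
 */
#ifndef __read_frequently
#define	__read_frequently
#endif

/* Global state, as the commit describes; replaces the per-thread field. */
bool scheduler_stopped __read_frequently = false;

/* Sketch of the macro: reads the global, takes no thread argument. */
#define	SCHEDULER_STOPPED()	(scheduler_stopped)

/* Hypothetical consumer showing a typical early-return use. */
int
lock_op_model(void)
{
	if (SCHEDULER_STOPPED())
		return (0);	/* primitive degrades to a no-op */
	return (1);		/* normal path would run here */
}
```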


# 12d6a032 18-Jan-2024 Olivier Certner <olce@FreeBSD.org>

Annotate 'rebooting' with __read_mostly

While here, put such annotation after the variable for 'dumping', since
it concerns the variable and not the type.

Reviewed by: markj
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43570


# eaed922e 18-Jan-2024 Olivier Certner <olce@FreeBSD.org>

panic()/KERNEL_PANICKED(): Move back to using 'panicstr' as a flag

Currently, no performance-critical path tests for a panic. Moreover, we
today have KERNEL_PANICKED() which wraps the test into
__predict_false(), already catering to those (potential) use cases.
Also, in practice we don't support 64-bit architectures without caches,
so reading an 'int' instead of a pointer doesn't (directly) save any
memory access. Finally, 'panicked' is redundant with 'panicstr' (and
wastes a tiny amount of memory).

Consequently:
1. Use again 'panicstr' as a flag indicating that the system is
   panicking. To this end:
   - Modify panic() so that it ensures this pointer is set to some
     non-NULL value even if the caller didn't pass any panic string.
   - Modify KERNEL_PANICKED() to test for 'panicstr'.
   - Remove 'panicked'.
2. Annotate 'panicstr' with '__read_mostly' (instead of using
   '__read_frequently' as for 'panicked'). This may have to be changed if,
   in the future, some performance-intensive path needs to test it.
3. Convert a few more direct tests of 'panicstr' to using
   KERNEL_PANICKED().

Reviewed by: kib, markj, emaste
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43569
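
The idea of using the pointer itself as the flag can be sketched as follows. This is a userland model, not the kernel source: `MODEL_PREDICT_FALSE`, `model_panic()` and the placeholder string are assumptions made for illustration, and `__builtin_expect` requires GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

const char *panicstr;	/* NULL until the first panic */

/* Branch hint similar in spirit to the kernel's __predict_false(). */
#define	MODEL_PREDICT_FALSE(e)	__builtin_expect((e) != 0, 0)
#define	KERNEL_PANICKED()	MODEL_PREDICT_FALSE(panicstr != NULL)

/* Model of the relevant part of panic(): always leave panicstr non-NULL. */
void
model_panic(const char *msg)
{
	if (panicstr == NULL) {
		/* Placeholder text is hypothetical; any non-NULL value works. */
		panicstr = (msg != NULL) ? msg : "<no panic string>";
	}
	assert(panicstr != NULL);
}
```

Because the pointer is guaranteed non-NULL after the first panic, no separate boolean is needed to answer "has the kernel panicked?".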


# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 4e78a766 23-Nov-2023 Mitchell Horne <mhorne@FreeBSD.org>

kern_reboot(): don't clear kdb_active

It is possible to reach this function from ddb via the "reset" command.
When this happens, we don't actually exit kdb, meaning we never execute
the latter steps of kdb_break() to restore the system state (e.g.
re-enable scheduler).

Therefore, we should not clear the kdb_active flag in this function, as
the debugger is still active. Put differently, kern_reboot() is not an
authority on kdb state, and should not touch it. The original motivation
for this assignment is not clear; I have checked thoroughly and I am
convinced it is not required by any reset code.

This fixes an edge case where a panic can be triggered during reset from
ddb:
1. Enter ddb via keyboard break sequence (KERNEL_PANICKED() == false &&
   td->td_critnest > 0)
2. Execute the "reset" command
3. kern_reboot() sets kdb_active = false
4. A witness_checkorder() call via shutdown handler sees !kdb_active
   and panics

Reviewed by: imp, markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42684


# 960612a1 23-Nov-2023 Mitchell Horne <mhorne@FreeBSD.org>

shutdown: tweak kproc/kthread shutdown check

This is to handle the case where the system has not panicked but the
debugger is active, where we still can't wait for thread termination.

Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42683


# deacab75 04-Nov-2023 Mark Johnston <markj@FreeBSD.org>

reboot: Avoid unlocking Giant if the scheduler is stopped

When the scheduler is stopped, mtx_unlock() turns into a no-op, so the
loop

	while (mtx_owned(&Giant))
		mtx_unlock(&Giant);

runs forever if the calling thread has Giant locked.

Reviewed by: mhorne
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42460
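
The hang and its avoidance can be reproduced with a toy mutex in userland. All `model_*` names are hypothetical; the sketch only mirrors the behaviour stated in the message (unlock is a no-op once the scheduler is stopped) and is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy recursive mutex; only the field this sketch needs. */
struct model_mtx {
	int recursion;		/* > 0 while owned by the caller */
};

bool model_scheduler_stopped;

bool
model_mtx_owned(const struct model_mtx *m)
{
	return (m->recursion > 0);
}

void
model_mtx_unlock(struct model_mtx *m)
{
	if (model_scheduler_stopped)
		return;		/* no-op, as the commit message describes */
	m->recursion--;
}

/*
 * Drop every recursive acquisition, but skip the loop entirely when
 * unlocking cannot make progress. Returns the number of unlocks done.
 */
int
model_drop_giant(struct model_mtx *giant)
{
	int n = 0;

	assert(giant != NULL);
	if (model_scheduler_stopped)
		return (0);
	while (model_mtx_owned(giant)) {
		model_mtx_unlock(giant);
		n++;
	}
	return (n);
}
```

Without the early return, the `while` loop would never terminate in the stopped case, because the ownership test stays true while the unlock does nothing.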


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# cab10561 25-Oct-2022 Mark Johnston <markj@FreeBSD.org>

kdb: Modify securelevel policy

Currently, sysctls which enable KDB in some way are flagged with
CTLFLAG_SECURE, meaning that you can't modify them if securelevel > 0.
This is so that KDB cannot be used to lower a running system's
securelevel, see commit 3d7618d8bf0b7. However, the newer mac_ddb(4)
restricts DDB operations which could be abused to lower securelevel
while retaining some ability to gather useful debugging information.

To enable the use of KDB (specifically, DDB) on systems with a raised
securelevel, change the KDB sysctl policy: rather than relying on
CTLFLAG_SECURE, add a check of the current securelevel to kdb_trap().
If the securelevel is raised, only pass control to the backend if MAC
specifically grants access; otherwise simply check to see if mac_ddb
vetoes the request, as before.

Add a new secure sysctl, debug.kdb.enter_securelevel, to override this
behaviour. That is, the sysctl lets one enter a KDB backend even with a
raised securelevel, so long as it is set before the securelevel is
raised.

Reviewed by: mhorne, stevek
MFC after: 1 month
Sponsored by: Juniper Networks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37122
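
One possible reading of this policy, written as a pure decision function. This is a sketch under assumptions: the struct and function names are hypothetical, and treating "raised" as "securelevel greater than the value of debug.kdb.enter_securelevel" is an interpretation of the text above, not a quote of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct kdb_policy_input {
	int securelevel;	/* current system securelevel */
	int enter_securelevel;	/* models debug.kdb.enter_securelevel */
	bool mac_grants;	/* a MAC policy explicitly grants access */
	bool mac_vetoes;	/* a MAC policy vetoes the request */
};

/* Returns true when control may be passed to the debugger backend. */
bool
kdb_entry_permitted(const struct kdb_policy_input *in)
{
	bool raised;

	assert(in != NULL);
	raised = in->securelevel > in->enter_securelevel;	/* assumption */
	if (raised)
		return (in->mac_grants);	/* need an explicit grant */
	return (!in->mac_vetoes);		/* otherwise only a veto blocks */
}
```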


# c3179891 20-Mar-2023 Mark Johnston <markj@FreeBSD.org>

kerneldump: Inline dump_savectx() into its callers

The callers of dump_savectx() (i.e., doadump() and livedump_start())
subsequently call dumpsys()/minidumpsys(), which dump the calling
thread's stack when writing the dump. If dump_savectx() gets its own
stack frame, that frame might be clobbered when its caller later calls
dumpsys()/minidumpsys(), making it difficult for debuggers to unwind the
stack.

Fix this by making dump_savectx() a macro, so that savectx() is always
called directly by the function which subsequently calls
dumpsys()/minidumpsys().

This fixes stack unwinding for the panicking thread from arm64
minidumps. The same happened to work on amd64, but kgdb reports the
dump_savectx() calls as coming from dumpsys(), so in that case it
appears to work by accident.

Fixes: c9114f9f86f9 ("Add new vnode dumper to support live minidumps")
Reviewed by: mhorne, jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39151


# 627ca221 23-Jan-2023 Mitchell Horne <mhorne@FreeBSD.org>

kern_reboot: unconditionally call shutdown_reset()

Currently shutdown_reset() is registered as the final entry of the
shutdown_final event handler. However, if a panic occurs early in boot
before the event is registered (SI_SUB_INTRINSIC), we may end up
spinning in the subsequent infinite for loop and failing to reset
altogether. Instead we can simply call this function unconditionally.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D37981


# 84ec7df0 12-Jul-2022 Colin Percival <cperciva@FreeBSD.org>

Add kern.reboot_wait_time sysctl

Historic FreeBSD behaviour (dating back to 1994-04-02) when rebooting
is to print "Rebooting..." and then
	/* wait 1 sec for printf's to complete and be read */

Prior to April 1994, there was a 100 ms delay (added 1993-11-12).

Since (a) most users will already be aware that the system is rebooting
and do not need to take time to read an additional message to that
effect, and (b) most FreeBSD systems don't have anyone actively looking
at the console anyway, this delay no longer serves much purpose.

This commit adds a kern.reboot_wait_time sysctl which defaults to 0;
historic behaviour can be regained by setting it to 1.

Reviewed by: imp
Relnotes: FreeBSD now reboots faster; to restore the traditional
wait after printing "Rebooting..." to the console, set
kern.reboot_wait_time=1 (or more).
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D35796


# c84c5e00 18-Jul-2022 Mitchell Horne <mhorne@FreeBSD.org>

ddb: annotate some commands with DB_CMD_MEMSAFE

This is not completely exhaustive, but covers a large majority of
commands in the tree.

Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35583


# 35eb9b10 02-Jun-2022 Mitchell Horne <mhorne@FreeBSD.org>

Use KERNEL_PANICKED() in more places

This is slightly more optimized than checking panicstr directly. For
most of these instances performance doesn't matter, but let's make
KERNEL_PANICKED() the common idiom.

Reviewed by: mjg
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D35373


# db71383b 13-May-2022 Mitchell Horne <mhorne@FreeBSD.org>

kerneldump: remove physical from dump routines

It is unused, especially now that the underlying d_dumper methods do not
accept the argument.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35174


# 489ba222 13-May-2022 Mitchell Horne <mhorne@FreeBSD.org>

kerneldump: remove physical argument from d_dumper

The physical address argument is essentially ignored by every dumper
method. In addition, the dump routines don't actually pass a real
address; every call to dump_append() passes a value of zero for
physical.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35173


# c9114f9f 23-Mar-2021 Mitchell Horne <mhorne@FreeBSD.org>

Add new vnode dumper to support live minidumps

This dumper can instantiate and write the dump's contents to a
file-backed vnode.

Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.

As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.

A future change to savecore(8) will add an option to save a live dump.

Reviewed by: markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with: kib
MFC after: 3 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D33813


# 59c27ea1 09-Aug-2021 Mitchell Horne <mhorne@FreeBSD.org>

Split out dumper allocation from list insertion

Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() will call this function and retains the existing
behaviour.

This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.

free_single_dumper() is made public and renamed to dumper_destroy().

Reviewed by: kib, markj
MFC after: 1 week
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34068


# 669d5ea4 02-Apr-2022 Gordon Bergling <gbe@FreeBSD.org>

kern: Fix a typo in a source code comment

- s/paniced/panicked/

MFC after: 3 days


# 5a8fceb3 21-Feb-2022 Mitchell Horne <mhorne@FreeBSD.org>

boottrace: trace annotations for startup and shutdown

Add trace events for execution of SYSINITs (both static and dynamically
loaded), and to the various steps in the shutdown/panic/reboot paths.

Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
X-NetApp-PR: #23
Differential Revision: https://reviews.freebsd.org/D30187


# 800e7495 28-Sep-2021 Mitchell Horne <mhorne@FreeBSD.org>

boot(9): update to match reality

This function was renamed to kern_reboot() in 2010, but the man page has
failed to keep in sync. Bring it up to date on the rename, add the
shutdown hooks to the synopsis, and document the (obvious) fact that
kern_reboot() does not return.

Fix an outdated reference to the old name in kern_reboot(), and leave a
reference to the man page so future readers might find it before any
large changes.

Reviewed by: imp, markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32085


# 13a58148 06-Aug-2021 Eric van Gyzen <vangyzen@FreeBSD.org>

netdump: send key before dump, in case dump fails

Previously, if an encrypted netdump failed, such as due to a timeout or
network failure, the key was not saved, so a partial dump was
completely useless.

Send the key first, so the partial dump can be decrypted, because even a
partial dump can be useful.

Reviewed by: bdrewery, markj
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D31453


# 67f508db 10-Aug-2021 Alexander Motin <mav@FreeBSD.org>

Mark some sysctls as CTLFLAG_MPSAFE.

MFC after: 2 weeks


# c8a96cdc 19-Nov-2020 Mitchell Horne <mhorne@FreeBSD.org>

Add an option for entering KDB on recursive panics

There are many cases where one would choose to avoid entering the debugger
on a normal panic, opting instead to reboot and possibly save a kernel
dump. However, recursive kernel panics are an unusual case that might
warrant attention from a human, so provide a secondary tunable,
debug.debugger_on_recursive_panic, to allow entering the debugger only
when this occurs.

For simplicity in maintaining existing behaviour, the tunable
defaults to zero.

Reviewed by: cem, markj
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27271
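
How the two knobs combine can be sketched as a small decision function. The struct and function names are hypothetical; only the knob names come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct panic_knobs {
	bool debugger_on_panic;
	bool debugger_on_recursive_panic;	/* defaults to false (zero) */
};

/* 'recursive' is true when panic() is entered while already panicking. */
bool
should_enter_debugger(const struct panic_knobs *k, bool recursive)
{
	assert(k != NULL);
	if (k->debugger_on_panic)
		return (true);
	return (recursive && k->debugger_on_recursive_panic);
}
```

With both knobs off the existing behaviour is kept; enabling only the secondary knob enters the debugger for recursive panics while first panics still proceed to dump and reboot.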


# 6debfd4b 06-Oct-2020 Mitchell Horne <mhorne@FreeBSD.org>

Remove unused function cpu_boot()

The prototype was added with the creation of kern_shutdown.c in r17658,
but it appears to have never been implemented. Remove it now.

Reviewed by: cem, kib
Differential Revision: https://reviews.freebsd.org/D26702


# 6255e8c8 27-Aug-2020 Mark Johnston <markj@FreeBSD.org>

Fix writing of the final block of encrypted, compressed kernel dumps.

Previously any residual data in the final block of a compressed kernel
dump would be written unencrypted. Note, such a configuration already
does not work properly when using AES-CBC since the compressed data is
typically not a multiple of the AES block length in size and EKCD does
not implement any padding scheme. However, EKCD more recently gained
support for using the ChaCha20 cipher, which being a stream cipher does
not have this problem.

Submitted by: sigsys@gmail.com
Reviewed by: cem
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D26188


# 4a711b8d 25-Jun-2020 John Baldwin <jhb@FreeBSD.org>

Use zfree() instead of explicit_bzero() and free().

In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().

Suggested by: cem
Reviewed by: cem, delphij
Approved by: csprng (cem)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25435


# ba0ced82 25-Apr-2020 Eric van Gyzen <vangyzen@FreeBSD.org>

Fix handling of NMIs from unknown sources (BMC, hypervisor)

Release kernels have no KDB backends enabled, so they discard an NMI
if it is not due to a hardware failure. This includes NMIs from
IPMI BMCs and hypervisors.

Furthermore, the interaction of panic_on_nmi, kdb_on_nmi, and
debugger_on_panic is confusing.

Respond to all NMIs according to panic_on_nmi and debugger_on_panic.
Remove kdb_on_nmi. Expand the meaning of panic_on_nmi by making
it a bitfield. There are currently two bits: one for NMIs due to
hardware failure, and one for all others. Leave room for more.

If panic_on_nmi and debugger_on_panic are both true, don't actually panic,
but directly enter the debugger, to allow someone to leave the debugger
and [hopefully] resume normal execution.

Reviewed by: kib
MFC after: 2 weeks
Relnotes: yes: machdep.kdb_on_nmi is gone; machdep.panic_on_nmi changed
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D24558
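
A sketch of the bitfield policy as a decision function. The bit names and values are hypothetical (the message only says there are two bits), and ignoring an NMI whose bit is clear is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit names and values for the two documented categories. */
#define	MODEL_NMI_HWFAIL	0x1	/* NMI caused by a hardware failure */
#define	MODEL_NMI_OTHER		0x2	/* NMI from any other source */

enum nmi_action { NMI_IGNORE, NMI_PANIC, NMI_DEBUGGER };

enum nmi_action
nmi_decide(int panic_on_nmi, bool debugger_on_panic, bool hw_failure)
{
	int bit = hw_failure ? MODEL_NMI_HWFAIL : MODEL_NMI_OTHER;

	if ((panic_on_nmi & bit) == 0)
		return (NMI_IGNORE);	/* assumption: unselected NMIs are dropped */
	/* Both knobs set: go straight to the debugger instead of panicking. */
	return (debugger_on_panic ? NMI_DEBUGGER : NMI_PANIC);
}
```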


# 7a119578 12-Mar-2020 Conrad Meyer <cem@FreeBSD.org>

kern_shutdown: Add missing EKCD ifdef

Submitted by: Puneeth Jothaiah <puneethkumar.jothaia AT dell.com>
Reviewed by: bdrewery
Sponsored by: Dell EMC Isilon


# b05ca429 02-Mar-2020 Pawel Biernacki <kaktus@FreeBSD.org>

sys/: Document few more sysctls.

Submitted by: Antranig Vartanian <antranigv@freebsd.am>
Reviewed by: kaktus
Commented by: jhb
Approved by: kib (mentor)
Sponsored by: illuria security
Differential Revision: https://reviews.freebsd.org/D23759


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# fe20aaec 22-Feb-2020 Ryan Libby <rlibby@FreeBSD.org>

sys/kern: quiet -Wwrite-strings

Quiet a variety of Wwrite-strings warnings in sys/kern at low-impact
sites. This patch avoids addressing certain others which would need to
plumb const through structure definitions.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D23798


# d199ad3b 11-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

Add "panicked" boolean which can be tested instead of panicstr

The test is performed all the time and reading entire panicstr to do it
wastes space.


# b249ce48 03-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

vfs: drop the mostly unused flags argument from VOP_UNLOCK

Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427


# abd80ddb 08-Dec-2019 Mateusz Guzik <mjg@FreeBSD.org>

vfs: introduce v_irflag and make v_type smaller

The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.

v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.

Reviewed by: kib, jeff
Differential Revision: https://reviews.freebsd.org/D22715


# 61322a0a 04-Dec-2019 Alexander Motin <mav@FreeBSD.org>

Mark some more hot global variables with __read_mostly.

MFC after: 1 week


# 3ad1ce46 20-Oct-2019 Andriy Gapon <avg@FreeBSD.org>

debug.kassert.warnings is a statistic, not a tunable

MFC after: 1 week


# addccb8c 17-Oct-2019 Conrad Meyer <cem@FreeBSD.org>

Add a very limited DDB dumpon(8)-alike to MI dumper code

This allows ddb(4) commands to construct a static dumperinfo during
panic/debug and invoke doadump(false) using the provided dumper
configuration (always inserted first in the list).

The intended use case is a ddb(4)-time netdump(4) command.

Reviewed by: markj (earlier version)
Differential Revision: https://reviews.freebsd.org/D21448


# 387df3b8 04-Sep-2019 Andriy Gapon <avg@FreeBSD.org>

shutdown_halt: make sure that watchdog timer is stopped

The point of halt is to keep the machine in limbo.

Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21222


# 82985292 23-May-2019 Conrad Meyer <cem@FreeBSD.org>

EKCD: Add Chacha20 encryption mode

Add Chacha20 mode to Encrypted Kernel Crash Dumps.

Chacha20 does not require messages to be multiples of block size, so it is
valid to use the cipher on non-block-sized messages without the explicit
padding AES-CBC would require. Therefore, allow use with simultaneous dump
compression. (Continue to disallow use of AES-CBC EKCD with compression.)

dumpon(8) gains a -C cipher flag to select between chacha and aes-cbc.
It defaults to chacha if no -C option is provided. The man page documents this
behavior.

Relnotes: sure
Sponsored by: Dell EMC Isilon
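
The length constraint behind this rule can be sketched as two small checks. The enum and function names are hypothetical; the 16-byte figure is the AES block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ekcd_cipher { EKCD_AES_CBC, EKCD_CHACHA20 };	/* hypothetical names */

#define	MODEL_AES_BLOCK_LEN	16	/* AES block size in bytes */

/* CBC without padding can only process whole blocks. */
bool
length_ok_for_cipher(enum ekcd_cipher c, size_t len)
{
	assert(c == EKCD_AES_CBC || c == EKCD_CHACHA20);
	if (c == EKCD_AES_CBC)
		return (len % MODEL_AES_BLOCK_LEN == 0);
	return (true);		/* stream cipher: any length */
}

/* Configuration check: compression yields arbitrary output lengths. */
bool
ekcd_config_allowed(enum ekcd_cipher c, bool compressed)
{
	return (!(c == EKCD_AES_CBC && compressed));
}
```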


# 6b6e2954 06-May-2019 Conrad Meyer <cem@FreeBSD.org>

List-ify kernel dump device configuration

Allow users to specify multiple dump configurations in a prioritized list.
This enables fallback to secondary device(s) if primary dump fails. E.g.,
one might configure a preference for netdump, but fall back to disk dump as a
second choice if netdump is unavailable.

This change does not list-ify netdump configuration, which is tracked
separately from ordinary disk dumps internally; only one netdump
configuration can be made at a time, for now. It also does not implement
IPv6 netdump.

savecore(8) is already capable of scanning and iterating multiple devices
from /etc/fstab or passed on the command line.

This change doesn't update the rc or loader variables 'dumpdev' in any way;
it can still be set to configure a single dump device, and rc.d/savecore
still uses it as a single device. Only dumpon(8) is updated to be able to
configure the more complicated configurations for now.

As part of revving the ABI, unify netdump and disk dump configuration ioctl
/ structure, and leave room for ipv6 netdump as a future possibility.
Backwards-compatibility ioctls are added to smooth ABI transition,
especially for developers who may not keep kernel and userspace perfectly
synced.

Reviewed by: markj, scottl (earlier version)
Relnotes: maybe
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D19996
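
The fallback behaviour can be sketched as a loop over a prioritized array. `struct model_dumper` and the stand-in dumpers are hypothetical and far simpler than the kernel's dumper configuration structures.

```c
#include <assert.h>
#include <stddef.h>

struct model_dumper {
	const char *name;
	int (*dump)(void);	/* returns 0 on success, nonzero on failure */
};

/*
 * Try each configured dumper in priority order and stop at the first
 * success. Returns the index used, or -1 if every dumper failed.
 */
int
dump_with_fallback(const struct model_dumper *list, size_t n)
{
	assert(list != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (list[i].dump() == 0)
			return ((int)i);
	}
	return (-1);
}

/* Stand-ins for real dumpers, used only to exercise the loop. */
int model_netdump_unavailable(void) { return (1); }
int model_diskdump_ok(void) { return (0); }
```

Listing the netdump stand-in first and the disk stand-in second reproduces the example from the message: the first entry fails and the second one is used.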


# b317cfd4 01-Nov-2018 John Baldwin <jhb@FreeBSD.org>

Don't enter DDB for fatal traps before panic by default.

Add a new 'debugger_on_trap' knob separate from 'debugger_on_panic'
and make the calls to kdb_trap() in MD fatal trap handlers prior to
calling panic() conditional on this new knob instead of
'debugger_on_panic'. Disable the new knob by default. Developers who
wish to recover from a fatal fault by adjusting saved register state
and retrying the faulting instruction can still do so by enabling the
new knob. However, for the more common case this makes the user
experience for panics due to a fatal fault match the user experience
for other panics, e.g. 'c' in DDB will generate a crash dump and
reboot the system rather than being stuck in an infinite loop of fatal
fault messages and DDB prompts.

Reviewed by: kib, avg
MFC after: 2 months
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D17768


# 4ca8c1ef 22-Aug-2018 Conrad Meyer <cem@FreeBSD.org>

KASSERT: Make runtime optionality optional

Add an option, KASSERT_PANIC_OPTIONAL, that allows runtime KASSERT()
behavior changes. When this option is not enabled, code that allows
KASSERTs to become optional is not enabled, and all violated assertions
cause termination.

The runtime KASSERT behavior was added in r243980.

One important distinction here is that panic has __dead2
("__attribute__((noreturn))"), while kassert_panic does not. Static analyzers
like Coverity understand __dead2. Without it, KASSERTs go misunderstood,
resulting in many false positives that result from violation of program
invariants.

Reviewed by: jhb, jtl, np, vangyzen
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D16835


# 11d4f748 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

remove unused variable


# bd92e6b6 05-May-2018 Mark Johnston <markj@FreeBSD.org>

Refactor some of the MI kernel dump code in preparation for netdump.

- Add clear_dumper() to complement set_dumper().
- Drain netdump's preallocated mbuf pool when clearing the dumper.
- Don't do bounds checking for dumpers with mediasize 0.
- Add dumper callbacks for initialization and for writing out headers.

Reviewed by: sbruno
MFC after: 1 month
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D15252


# 65df1248 24-Apr-2018 Conrad Meyer <cem@FreeBSD.org>

Do not totally silence suppressed secondary kasserts unless debug.kassert.do_log is disabled

To totally silence and ignore secondary kassert violations after a primary
panic, set debug.kassert.do_log=0 and debug.kassert.suppress_in_panic=1.

Additional assertion warnings shouldn't block core dump and may alert the
developer to another erroneous condition. Secondary stack traces may be
printed, identically to the unsuppressed case where panic() is reentered --
controlled via debug.trace_all_panics.

Sponsored by: Dell EMC Isilon
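
The interaction of the two sysctls named above can be sketched as follows. The names are hypothetical, and the model deliberately ignores other knobs such as debug.kassert.warn_only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct kassert_knobs {
	bool do_log;			/* models debug.kassert.do_log */
	bool suppress_in_panic;		/* models debug.kassert.suppress_in_panic */
};

enum kassert_result { KA_SILENT, KA_LOG_ONLY, KA_PANIC };

/* Outcome of a failed assertion, given whether a panic is in progress. */
enum kassert_result
secondary_kassert(const struct kassert_knobs *k, bool already_panicked)
{
	assert(k != NULL);
	if (already_panicked && k->suppress_in_panic)
		return (k->do_log ? KA_LOG_ONLY : KA_SILENT);
	return (KA_PANIC);
}
```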


# 07aa6ea6 24-Apr-2018 Conrad Meyer <cem@FreeBSD.org>

Fix debug.kassert.do_log description text

This has been an (incorrect) copy-paste duplicate of debug.kassert.warn_only
since it was originally committed in r243980.

Sponsored by: Dell EMC Isilon


# ad1fc315 24-Apr-2018 Conrad Meyer <cem@FreeBSD.org>

panic: Optionally, trace secondary panics

To diagnose and fix secondary panics, it is useful to have a stack trace.
When panic tracing is enabled, optionally trace secondary panics as well.

The option is configured with the tunable/sysctl debug.trace_all_panics.

(The original concern that inspired only tracing the primary panic was
likely that the secondary trace may scroll the original panic message or trace
off the screen. This is less of a concern for serial consoles with logging.
Not everything has a serial console, though, so the behavior is optional.)

Discussed with: jhb
Sponsored by: Dell EMC Isilon


# 18959b69 24-Apr-2018 Jonathan T. Looney <jtl@FreeBSD.org>

Update r332860 by changing the default from suppressing post-panic
assertions to not suppressing post-panic assertions.

There are some post-panic assertions that are valuable and we shouldn't
default to disabling them. However, when a user trips over them, the
user can still adjust the tunable/sysctl to suppress them temporarily to
conduct troubleshooting (e.g. get a core dump).

Reported by: cem, markj


# 44b71282 21-Apr-2018 Jonathan T. Looney <jtl@FreeBSD.org>

When running with INVARIANTS, the kernel contains extra checks. However,
these assumptions may not hold true once we've panic'd. Therefore, the
checks hold less value after a panic. Additionally, if one of the checks
fails while we are already panic'd, this creates a double-panic which can
interfere with debugging the original panic.

Therefore, this commit allows an administrator to suppress a response to
KASSERT checks after a panic by setting a tunable/sysctl. The
tunable/sysctl (debug.kassert.suppress_in_panic) defaults to being
enabled.

Reviewed by: kib
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D12920


# c3982007 22-Mar-2018 Konstantin Belousov <kib@FreeBSD.org>

Do not send signals to init directly from shutdown_nice(9), do it from
the task context.

shutdown_nice() is used from the fast interrupt handlers, mostly for
console drivers, where we cannot lock blockable locks. Schedule the
task in the fast queue to send the signal from the proper context.

Reviewed by: imp
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# f0d847af 22-Mar-2018 Warner Losh <imp@FreeBSD.org>

Drop any recursed taking of Giant once and for all at the top of
kern_reboot(). The shutdown path is now safe to run without Giant.

Discussed with: kib@
Sponsored by: Netflix


# d5292812 21-Mar-2018 Warner Losh <imp@FreeBSD.org>

Remove Giant from init creation and vfs_mountroot.

Sponsored by: Netflix
Discussed with: kib@, mckusick@
Differential Revision: https://reviews.freebsd.org/D14712


# bde3b1e1 08-Mar-2018 Mark Johnston <markj@FreeBSD.org>

Return E2BIG if we run out of space writing a compressed kernel dump.

ENOSPC causes the MD kernel dump code to retry the dump, but this is
undesirable in the case where we legitimately ran out of space.
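
Why the error code matters can be shown with a minimal model of a caller
that retries on ENOSPC. The function name, the callable argument and the
retry limit are assumptions made for illustration; only the
"ENOSPC means retry" behaviour comes from the text above.

```python
import errno


def dump_with_retry(attempt_dump, max_tries=5):
    """Model of MD dump code that retries while an attempt returns ENOSPC.

    attempt_dump is a hypothetical callable returning 0 or an errno value.
    Returns (final_error, number_of_attempts).
    """
    tries = 0
    while True:
        tries += 1
        error = attempt_dump()
        if error == errno.ENOSPC and tries < max_tries:
            # The caller assumes the dump grew while it was being written
            # and starts over.
            continue
        return error, tries
```

A compressed dump that truly ran out of space and reports E2BIG stops after
one attempt, while ENOSPC keeps the loop going until the limit is reached.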


# 6026dcd7 13-Feb-2018 Mark Johnston <markj@FreeBSD.org>

Add support for zstd-compressed user and kernel core dumps.

This works similarly to the existing gzip compression support, but
zstd is typically faster and gives better compression ratios.

Support for this functionality must be configured by adding ZSTDIO to
one's kernel configuration file. dumpon(8)'s new -Z option is used to
configure zstd compression for kernel dumps. savecore(8) now recognizes
and saves zstd-compressed kernel dumps with a .zst extension.

Submitted by: cem (original version)
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D13101,
https://reviews.freebsd.org/D13633


# 78f57a9c 08-Jan-2018 Mark Johnston <markj@FreeBSD.org>

Generalize the gzio API.

We currently use a set of subroutines in kern_gzio.c to perform
compression of user and kernel core dumps. In the interest of adding
support for other compression algorithms (zstd) in this role without
complicating the API consumers, add a simple compressor API which can be
used to select an algorithm.

Also change the (non-default) GZIO kernel option to not enable
compressed user cores by default. It's not clear that such a default
would be desirable with support for multiple algorithms implemented,
and it's inconsistent in that it isn't applied to kernel dumps.

Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D13632


# 4daa09f3 23-Dec-2017 Alexander Kabaev <kan@FreeBSD.org>

Remove dead store to local variable.


# efe67753 25-Nov-2017 Nathan Whitehorn <nwhitehorn@FreeBSD.org>

Remove some, but not all, assumptions that the BSP is CPU 0 and that CPUs
are numbered densely from there to n_cpus.

MFC after: 1 month


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# 48f1a492 13-Nov-2017 Warner Losh <imp@FreeBSD.org>

Add two new tunables / sysctls to control reboot after panic:

kern.poweroff_on_panic which, when enabled, instructs a system to
power off on a panic instead of a reboot.

kern.powercycle_on_panic which, when enabled, instructs a system to
power cycle, if possible, on a panic instead of a reboot.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13042


# 7d41b6f0 25-Oct-2017 Warner Losh <imp@FreeBSD.org>

Handle RB_POWERCYCLE in the MI part of the kernel

Signal init with SIGWINCH in shutdown_nice for RB_POWERCYCLE.

Sponsored by: Netflix


# 64a16434 24-Oct-2017 Mark Johnston <markj@FreeBSD.org>

Add support for compressed kernel dumps.

When using a kernel built with the GZIO config option, dumpon -z can be
used to configure gzip compression using the in-kernel copy of zlib.
This is useful on systems with large amounts of RAM, which require a
correspondingly large dump device. Recovery of compressed dumps is also
faster since fewer bytes need to be copied from the dump device.

Because we have no way of knowing the final size of a compressed dump
until it is written, the kernel will always attempt to dump when
compression is configured, regardless of the dump device size. If the
dump is aborted because we run out of space, an error is reported on
the console.

savecore(8) is modified to handle compressed dumps and save them to
vmcore.<index>.gz, as it does when given the -z option.

A new rc.conf variable, dumpon_flags, is added. Its value is added to
the boot-time dumpon(8) invocation that occurs when a dump device is
configured in rc.conf.

Reviewed by: cem (earlier version)
Discussed with: def, rgrimes
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11723


# 46fcd1af 18-Oct-2017 Mark Johnston <markj@FreeBSD.org>

Move kernel dump offset tracking into MI code.

All of the kernel dump implementations keep track of the current offset
("dumplo") within the dump device. However, except for textdumps, they
all write the dump sequentially, so we can reduce code duplication by
having the MI code keep track of the current offset. The new
dump_append() API can be used to write at the current offset.

This is needed to implement support for kernel dump compression in the
MI kernel dump code.

Also simplify dump_encrypted_write() somewhat: use dump_write() instead
of duplicating its bounds checks, and get rid of the redundant offset
tracking.

Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11722
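
The idea of tracking the offset in one place can be sketched as follows.
This is a user-space model under stated assumptions: the class is
hypothetical, the dump device is a plain byte buffer, and only the names
dump_write and dump_append are borrowed from the text above.

```python
class Dumper:
    """Toy dump device that keeps the current offset itself (hypothetical)."""

    def __init__(self, mediasize):
        self.media = bytearray(mediasize)
        self.dumpoff = 0  # current offset, tracked in one place

    def dump_write(self, data, offset):
        # Explicit-offset write with a bounds check.
        if offset < 0 or offset + len(data) > len(self.media):
            raise ValueError("write outside the dump device")
        self.media[offset:offset + len(data)] = data

    def dump_append(self, data):
        # Sequential writers no longer carry their own "dumplo" around.
        self.dump_write(data, self.dumpoff)
        self.dumpoff += len(data)
```

Callers that write sequentially only ever call dump_append(); a writer that
needs random access, as textdumps do, can still use dump_write() directly.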


# e9666bf6 17-Aug-2017 Mark Johnston <markj@FreeBSD.org>

Remove some unneeded subroutines for padding writes to dump devices.

Right now we only need to pad when writing kernel dump headers, so
flatten three related subroutines into one. The encrypted kernel dump
code already writes out its key in a dumper.blocksize-sized block.

No functional change intended.

Reviewed by: cem, def
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11647


# 01938d36 17-Aug-2017 Mark Johnston <markj@FreeBSD.org>

Rename mkdumpheader() and group EKCD functions in kern_shutdown.c.

This helps simplify the code in kern_shutdown.c and reduces the number
of globally visible functions.

No functional change intended.

Reviewed by: cem, def
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11603


# 50ef60da 17-Aug-2017 Mark Johnston <markj@FreeBSD.org>

Factor out duplicated kernel dump code into dump_{start,finish}().

dump_start() and dump_finish() are responsible for writing kernel dump
headers, optionally writing the key when encryption is enabled, and
initializing the initial offset into the dump device.

Also remove the unused dump_pad(), and make some functions static now that
they're only called from kern_shutdown.c.

No functional change intended.

Reviewed by: cem, def
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11584


# ab384d75 15-Jul-2017 Mark Johnston <markj@FreeBSD.org>

Revert r320918 and have mkdumpheader() handle version string truncation.

Reported by: jhb
MFC after: 1 week


# 6cf0c1db 06-Mar-2017 Gleb Smirnoff <glebius@FreeBSD.org>

Fix compilation of r314784 on 32 bit.


# f2498877 06-Mar-2017 Gleb Smirnoff <glebius@FreeBSD.org>

In panic() print the current timestamp, which matches the timestamp in the
dump header. This will help to correlate console server logs with dump files,
no matter how precise the clock on a console server appliance is, and how
buggy the appliance is.


# b4b4b530 28-Jan-2017 Baptiste Daroussin <bapt@FreeBSD.org>

Revert crap accidentally committed


# 814aaaa7 28-Jan-2017 Baptiste Daroussin <bapt@FreeBSD.org>

Revert r312923; a better approach will be taken later


# 42d33c1f 14-Jan-2017 Mark Johnston <markj@FreeBSD.org>

Stop the scheduler upon panic even in non-SMP kernels.

This is needed for kernel dumps to work, as the panicking thread will call
into code that makes use of kernel locks.

Reported and tested by: Eugene Grosbein
MFC after: 1 week


# 480f31c2 10-Dec-2016 Konrad Witaszczyk <def@FreeBSD.org>

Add support for encrypted kernel crash dumps.

Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.

A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.

dumpon(8) generates a one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported
but EKCD was designed to implement support for other algorithms in the future.
The public key is chosen using the -k flag. The dumpon rc(8) script can do this
automatically during startup using the dumppubkey rc.conf(5) variable. Once the
keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O
control.

When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of the previous value. This is intended to make a possible
differential cryptanalysis harder since it is possible to write multiple crash
dumps without reboot by repeating the following commands:
# sysctl debug.kdb.enter=1
db> call doadump(0)
db> continue
# savecore
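
The IV update described above amounts to a hash chain. A minimal sketch,
assuming a 32-byte IV buffer and using a hypothetical helper name; how many
bytes of the value a particular cipher consumes is not modelled here:

```python
import hashlib


def iv_sequence(initial_iv, ndumps):
    """Return the IV used for each of ndumps consecutive dump attempts.

    Before every attempt the IV is replaced by the SHA-256 hash of its
    previous value, so no two dumps taken without a reboot share an IV.
    """
    ivs = []
    iv = initial_iv
    for _ in range(ndumps):
        iv = hashlib.sha256(iv).digest()
        ivs.append(iv)
    return ivs
```

Because each value depends only on the one before it, the sequence is
deterministic given the initial random IV, yet every dump in the sequence is
encrypted under a different IV.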

A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to a block size.
The header structure has 512 bytes to match the block size so it was required to
make a panic string 4 bytes shorter to add a new field to the header structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.

Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed which makes it impossible
to use the CBC mode for encryption. It should be also noted that textdumps don't
contain sensitive data by design as a user decides what information should be
dumped.

savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.

decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.

Description on how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.

EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64 as it wasn't possible to run
FreeBSD due to the problems with QEMU emulation and lack of hardware.

Designed by: def, pjd
Reviewed by: cem, oshogbo, pjd
Partial review: delphij, emaste, jhb, kib
Approved by: pjd (mentor)
Differential Revision: https://reviews.freebsd.org/D4712


# 69a28758 15-Sep-2016 Ed Maste <emaste@FreeBSD.org>

Renumber license clauses in sys/kern to avoid skipping #3


# a0d20ecb 05-Jul-2016 Gleb Smirnoff <glebius@FreeBSD.org>

Compile in the kassert_panic() function with INVARIANT_SUPPORT
option, not INVARIANTS. The function is required if we want
to load in a module that is compiled with INVARIANTS.

Reviewed by: jhb
Approved by: re (gjb)


# 3af72c11 06-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Implement a `show panic` command to DDB which will helpfully print the
panic string again if set, in case it scrolled out of the active
window. This avoids having to remember the symbol name.

Also add a show callout <addr> command to DDB in order to inspect
some struct callout fields in case of panics in the callout code.
This may help to see if there was memory corruption or to further
ease debugging problems.

Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Reviewed by: jhb (comment only on the show panic initially)
Differential Revision: https://reviews.freebsd.org/D4527


# f7bd2217 31-May-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Cosmetics - add missing space after ellipses in shutdown messages.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 5dc5dab6 15-Apr-2016 Conrad Meyer <cem@FreeBSD.org>

Add 4Kn kernel dump support

(And 4Kn minidump support, but only for amd64.)

Make sure all I/O to the dump device is of the native sector size. To
that end, we keep a native sector sized buffer associated with dump
devices (di->blockbuf) and use it to pad smaller objects as needed (e.g.
kerneldumpheader).

Add dump_write_pad() as a convenience API to dump smaller objects with
zero padding. (Rather than pull in NPM leftpad, we wrote our own.)

savecore(8) has been updated to deal with these dumps. The format for
512-byte sector dumps should remain backwards compatible.

Minidumps for other architectures are left as an exercise for the
reader.

PR: 194279
Submitted by: ambrisko@
Reviewed by: cem (earlier version), rpokala
Tested by: rpokala (4Kn/512 except 512 fulldump), cem (512 fulldump)
Relnotes: yes
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D5848
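
Zero padding a small object up to the native sector size can be sketched as
below. The Python function only borrows the name dump_write_pad from the
text above; its signature and return value are assumptions for illustration.

```python
def dump_write_pad(obj, blocksize):
    """Return obj zero-padded to exactly one native sector.

    Models padding a small object (such as a 512-byte dump header) so that
    all I/O to the dump device is of the native sector size.
    """
    if len(obj) > blocksize:
        raise ValueError("object larger than the sector size")
    return obj + bytes(blocksize - len(obj))
```

On a 4Kn device a 512-byte header becomes one 4096-byte write whose tail is
zeros; on a 512-byte-sector device the header is written unchanged, which is
consistent with the older format staying backwards compatible.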


# 1f12da0e 22-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Just checkpoint the WIP in order to be able to make the tree update
easier. Note: this is currently not in a usable state as certain
teardown parts are not called and the DOMAIN rework is missing.
More to come soon and find its way to head.

Obtained from: P4 //depot/user/bz/vimage/...
Sponsored by: The FreeBSD Foundation


# 2eb0015a 01-Oct-2015 Colin Percival <cperciva@FreeBSD.org>

Disable suspend when we're shutting down. This solves the "tell FreeBSD
to shut down; close laptop lid" scenario which otherwise tended to end
with a laptop overheating or the battery dying.

The implementation uses a new sysctl, kern.suspend_blocked; init(8) sets
this while rc.suspend runs, and the ACPI sleep code ignores requests while
the sysctl is set.

Discussed on: freebsd-acpi (35 emails)
MFC after: 1 week


# 0d3d0cc3 18-Sep-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Kernel part of reroot support - a way to change rootfs without reboot.

Note that the mountlist manipulations are somewhat fragile, and not very
pretty. The reason for this is to avoid changing vfs_mountroot(), which
is (obviously) rather mission-critical, but not very well documented,
and thus hard to test properly. It might be possible to rework it to use
its own simple root mount mechanism instead of vfs_mountroot().

Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D2698


# 98082691 28-Jul-2015 Jeff Roberson <jeff@FreeBSD.org>

- Make 'struct buf *buf' private to vfs_bio.c. Having a global variable
'buf' is inconvenient and has led me to some irritating-to-discover
bugs over the years. It also makes it more challenging to refactor
the buf allocation system.
- Move swbuf and declare it as an extern in vfs_bio.c. This is still
not perfect but better than it was before.
- Eliminate the unused ffs function that relied on knowledge of the buf
array.
- Move the shutdown code that iterates over the buf array into vfs_bio.c.

Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division


# 7a9c38e6 19-May-2015 Alan Somers <asomers@FreeBSD.org>

Properly null-terminate strings in a kernel dump header. A version string
longer than 192 bytes will cause the version field of a dump header to
overflow. strncpy doesn't null terminate it, so savecore will print a
corrupted info file. Using strlcpy fixes the bug.

Differential Revision: https://reviews.freebsd.org/D2560
Reviewed by: markj
MFC after: 3 weeks
Sponsored by: Spectra Logic
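
The difference between the two copy routines can be modelled on a byte
buffer. This is a sketch of the C semantics in Python, not the kernel code;
the functions operate on a bytearray, and the source is assumed to contain
no NUL bytes, as a C string would not.

```python
def strncpy(dst, src, n):
    """C strncpy semantics: copy at most n bytes, pad with NULs if src is
    shorter, and add no terminator when src fills the buffer."""
    chunk = src[:n]
    dst[:n] = chunk + bytes(n - len(chunk))


def strlcpy(dst, src, size):
    """strlcpy semantics: copy at most size - 1 bytes and always terminate.

    Returns the length of src, as the C function does."""
    if size > 0:
        chunk = src[:size - 1]
        dst[:len(chunk)] = chunk
        dst[len(chunk)] = 0
    return len(src)
```

With a 192-byte field and a longer version string, the strncpy model leaves
no NUL anywhere in the field, whereas the strlcpy model truncates to 191
bytes and terminates.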


# 9ad64f27 01-May-2015 Mark Johnston <markj@FreeBSD.org>

Remove a stale reference to the stop_scheduler_on_panic tunable, which
itself was removed in r243515.

MFC after: 1 week


# da10a603 23-Apr-2015 Mark Johnston <markj@FreeBSD.org>

Make vpanic() externally visible so that it can be called as part of the
DTrace panic() action.

Differential Revision: https://reviews.freebsd.org/D2349
Reviewed by: avg
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division


# bdb9ab0d 06-Jan-2015 Mark Johnston <markj@FreeBSD.org>

Factor out duplicated code from dumpsys() on each architecture into generic
code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly
identical and simply redefine a number of constants and helper subroutines;
a generic implementation will make it easier to implement features around
kernel core dumps. This change does not alter any minidump code and should
have no functional impact.

PR: 193873
Differential Revision: https://reviews.freebsd.org/D904
Submitted by: Conrad Meyer <conrad.meyer@isilon.com>
Reviewed by: jhibbits (earlier version)
Sponsored by: EMC / Isilon Storage Division


# 5ebb15b9 10-Nov-2014 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Add missing privilege check when setting the dump device. Before that change it
was possible for a regular user to set up the dump device if he had write access
to the given device. In theory it is a security issue as a user might get access
to the kernel's memory after provoking a kernel crash, but in practice it is not
recommended to give regular users direct access to storage devices.

Rework the code so that we do the privilege check within the set_dumper() function
to avoid similar problems in the future.

Discussed with: secteam


# f6b4f5ca 25-Jul-2014 Gavin Atkinson <gavin@FreeBSD.org>

Add error return to dumpsys(), and use it in doadump().

This commit does not add error returns to minidumpsys() or
textdump_dumpsys(); those can also be added later.

Submitted by: Conrad Meyer (EMC / Isilon storage division)


# af3b2549 27-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Pull in r267961 and r267973 again. Fix for issues reported will follow.


# 37a107a4 27-Jun-2014 Glen Barber <gjb@FreeBSD.org>

Revert r267961, r267973:

These changes prevent sysctl(8) from returning proper output,
such as:

1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory


# 3da1cf1e 27-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating children of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, since some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading kludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, since malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after: 2 weeks
Sponsored by: Mellanox Technologies


# 8f5b107b 07-Apr-2014 Ed Schouten <ed@FreeBSD.org>

Thinko: don't forget to apply 'howto' in case init(8) isn't running.


# 912d5937 07-Apr-2014 Ed Schouten <ed@FreeBSD.org>

Clean up shutdown_nice(). Just send the right signal to init(8).

Right now, init(8) cannot distinguish between an ACPI power button press
or a Ctrl+Alt+Del sequence on the keyboard. This is because
shutdown_nice() sends SIGINT to init(8) unconditionally, but later
modifies the arguments to reboot(2) to force a certain behaviour.

Instead of doing this, patch up the code to just forward the appropriate
signal to userspace. SIGUSR1 and SIGUSR2 can already be used to halt the
system.

While there, move waittime to the function where it's used; kern_reboot().
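
Forwarding "the appropriate signal" can be sketched as a mapping from the
requested action to a signal for init(8). The flag values and the exact
mapping below are assumptions based on <sys/reboot.h> and the signals
documented in init(8); the function name is hypothetical.

```python
import signal

# Flag values as recalled from <sys/reboot.h>; treat them as assumptions.
RB_HALT = 0x008
RB_POWEROFF = 0x4000


def shutdown_signal(howto):
    """Pick the signal to send to init(8) for a given 'howto' (sketch)."""
    if howto & RB_POWEROFF:
        # Checked first: poweroff is usually requested together with halt.
        return signal.SIGUSR2
    if howto & RB_HALT:
        return signal.SIGUSR1
    # Plain reboot, e.g. Ctrl+Alt+Del.
    return signal.SIGINT
```

With this split init(8) can tell a power button press (poweroff) from a
Ctrl+Alt+Del sequence (reboot) by the signal it receives alone.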


# 3b251028 04-Dec-2013 Colin Percival <cperciva@FreeBSD.org>

Make panic_reboot_wait_time static.

Submitted by: jhb


# 1cdbb9ed 03-Dec-2013 Colin Percival <cperciva@FreeBSD.org>

Add a new sysctl / loader tunable kern.panic_reboot_wait_time which
defaults to PANIC_REBOOT_WAIT_TIME (a long-existing kernel config
setting). Use this now-variable value in place of the defined constant
to control how long the system waits after a panic before rebooting.


# 89f6b863 08-Mar-2013 Attilio Rao <attilio@FreeBSD.org>

Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical but a few notes are in order:
* The KPI changes as follows:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* Avoiding namespace pollution in vm/vm_pager.h (which already forced
consumers to include sys/mutex.h directly to cater for its inline
functions using VM_OBJECT_LOCK()) means that all vm/vm_pager.h
consumers must now also include sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
For this purpose zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.

The KPI is heavily broken by this commit. Third-party ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho


# 6b6bd3b7 10-Dec-2012 Alfred Perlstein <alfred@FreeBSD.org>

Switch the hardwired WITNESS panics to kassert_panic.

This is an ongoing effort to provide runtime debug information
useful in the field that does not panic existing installations.

This gives us the flexibility needed when shipping images to a
potentially large audience with WITNESS enabled without worrying
about formerly non-fatal LORs hurting a release.

Sponsored by: iXsystems


# a94053ba 10-Dec-2012 Alfred Perlstein <alfred@FreeBSD.org>

allow KASSERT to enter KDB.


# 3945a964 07-Dec-2012 Alfred Perlstein <alfred@FreeBSD.org>

Allow KASSERT to log instead of panic.

This is to allow debug images to be used without taking down the
system when non-fatal asserts are hit.

The following sysctls are added:

debug.kassert.warn_only: 1 = log, 0 = panic

debug.kassert.do_ktr: set to a ktr mask for logging via KTR

debug.kassert.do_log: 1 = log, 0 = quiet

debug.kassert.warnings: stats, number of kasserts hit

debug.kassert.log_panic_at:
number of kasserts before we actually panic, 0 = never

debug.kassert.log_pps_limit: pps limit for log messages

debug.kassert.log_mute_at: stop warning after N kasserts, 0 = never stop

debug.kassert.kassert: set this sysctl to trigger a kassert

Discussed with: scottl, gnn, marcel
Sponsored by: iXsystems


# 6898bee9 25-Nov-2012 Andriy Gapon <avg@FreeBSD.org>

remove stop_scheduler_on_panic knob

There have not been any complaints about the default behavior, so there
is no need to keep a knob that enables the worse alternative.

Now that the hard-stopping of other CPUs is the only behavior, the panic_cpu
spinlock-like logic can be dropped, because only a single CPU is
supposed to win stop_cpus_hard(other_cpus) race and proceed past that
call.

MFC after: 1 month


# 5a3a8ec0 02-Nov-2012 Alfred Perlstein <alfred@FreeBSD.org>

Merge 242488, better use of strlcpy.

Submitted by: Eric van Gyzen <eric@vangyzen.net>


# bad7e7f3 01-Nov-2012 Alfred Perlstein <alfred@FreeBSD.org>

Provide a device name in the sysctl tree for programs to query the
state of crashdump target devices.

This will be used to add a "-l" (ell) flag to dumpon(8) to list the
currently configured dumpdev.

Reviewed by: phk


# 7adc598a 03-Jun-2012 Andriy Gapon <avg@FreeBSD.org>

free wdog_kern_pat calls in post-panic paths from under SW_WATCHDOG

Those calls are useful with hardware watchdog drivers too.

MFC after: 3 weeks


# ac6e25ec 22-May-2012 Hartmut Brandt <harti@FreeBSD.org>

Make dumptid non-static. It is used by libkvm to detect whether
this is a VNET-kernel or not. gcc used to put the static symbol into
the symbol table, clang does not. This fixes the 'netstat: no namelist'
error seen on clang+VNET systems.


# 5d7380f8 28-Jan-2012 Attilio Rao <attilio@FreeBSD.org>

Avoid checking the same cache line/variable from all the locking
primitives by breaking stop_scheduler into a per-thread variable.
Also, store the new td_stopsched very close to the td_*locks members, as
they will be accessed mostly in the same codepaths as td_stopsched;
this possibly avoids further cache-line pollution.

Giving STOP_SCHEDULER() a new 'thread' argument was considered, in order to
take advantage of an already cached curthread, but in the end there should
not really be a performance benefit, while it would introduce a KPI breakage.

In collaboration with: flo
Reviewed by: avg
MFC after: 3 months (or never)
X-MFC: r228424


# 90d82653 08-Jan-2012 Andriy Gapon <avg@FreeBSD.org>

enable stop_scheduler_on_panic by default

My plan is to make this behavior unconditional before 10.0 release.

X-MFC after: r228424 (if ever)


# bf8696b4 17-Dec-2011 Andriy Gapon <avg@FreeBSD.org>

introduce cngrab/cnungrab stub calls in some places where they make sense

MFC after: 2 months


# 1c5151f3 13-Dec-2011 David E. O'Brien <obrien@FreeBSD.org>

Match other formatting.


# 3d7618d8 13-Dec-2011 David E. O'Brien <obrien@FreeBSD.org>

Disallow various debug.kdb sysctl's when securelevel is raised.

PR: 161350


# 3eb9ab52 12-Dec-2011 Eitan Adler <eadler@FreeBSD.org>

Document a large number of currently undocumented sysctls. While here
fix some style(9) issues and reduce redundancy.

PR: kern/155491
PR: kern/155490
PR: kern/155489
Submitted by: Galimov Albert <wtfcrap@mail.ru>
Approved by: bde
Reviewed by: jhb
MFC after: 1 week


# 35370593 11-Dec-2011 Andriy Gapon <avg@FreeBSD.org>

panic: add a switch and infrastructure for stopping other CPUs in SMP case

Historical behavior of letting other CPUs merrily go on is the default for
the time being. The new behavior can be switched on via
kern.stop_scheduler_on_panic tunable and sysctl.

Stopping of the CPUs has (at least) the following benefits:
- more of the system state at panic time is preserved intact
- threads and interrupts do not interfere with dumping of the system
state

Only one thread runs uninterrupted after panic if stop_scheduler_on_panic
is set. That thread might call code that is also used in normal context
and that code might use locks to prevent concurrent execution of certain
parts. Those locks might be held by the stopped threads and would never
be released. To work around this issue, it was decided that instead of
explicit checks for panic context, we would rather put those checks
inside the locking primitives.

This change has substantial portions written and re-written by attilio
and kib at various times. Other changes are heavily based on the ideas
and patches submitted by jhb and mdf. bde has provided many insights
into the details and history of the current code.

The new behavior may cause problems for systems that use a USB keyboard
for interfacing with system console. This is because of some unusual
locking patterns in the ukbd code which have to be used because on one
hand ukbd is below syscons, but on the other hand it has to interface
with other usb code that uses regular mutexes/Giant for its concurrency
protection. Dumping to USB-connected disks may also be affected.

PR: amd64/139614 (at least)
In cooperation with: attilio, jhb, kib, mdf
Discussed with: arch@, bde
Tested by: Eugene Grosbein <eugen@grosbein.net>,
gnn,
Steven Hartland <killing@multiplay.co.uk>,
glebius,
Andrew Boyer <aboyer@averesystems.com>
(various versions of the patch)
MFC after: 3 months (or never)
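
Putting the panic check inside the locking primitive, rather than in every
caller, can be sketched with a toy mutex. Everything here is a hypothetical
user-space model: the class names, the string thread identifiers and the
boolean return value (True = proceed, False = would block) are assumptions.

```python
class Kernel:
    """Holds the global 'scheduler stopped' state set on panic."""

    def __init__(self):
        self.scheduler_stopped = False


class Mutex:
    """Toy mutex whose lock operation checks for panic context itself."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.owner = None

    def lock(self, thread):
        if self.kernel.scheduler_stopped:
            # After a panic only one thread runs; a lock held by a stopped
            # thread would never be released, so let the caller through.
            return True
        if self.owner is not None and self.owner != thread:
            return False  # would block in normal operation
        self.owner = thread
        return True
```

Code shared between normal and panic context then needs no explicit
"are we panicking?" checks of its own, which is the trade-off the text
above describes.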


# 6472ac3d 07-Nov-2011 Ed Schouten <ed@FreeBSD.org>

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# 8451d0dd 16-Sep-2011 Kip Macy <kmacy@FreeBSD.org>

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal to kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


# 58379067 12-Sep-2011 Attilio Rao <attilio@FreeBSD.org>

dump_write() returns ENXIO if the dump would be written outside
of the device boundary.
While this is generally ok, the problem is that all the consumers
handle similar cases by expecting (and catching) ENOSPC (for a
reference look at the minidumpsys() and dumpsys() implementations). As a
result, consumers do not recognize the issue and amd64 fails to
retry if the number of pages grows during minidump.
Fix this by returning ENOSPC in dump_write() and while here add some
more diagnostics on the involved values.

Sponsored by: Sandvine Incorporated
In collaboration with: emaste
Approved by: re (kib)
MFC after: 10 days


# fa2b39a1 07-Sep-2011 Attilio Rao <attilio@FreeBSD.org>

Improve the information reported in case of busy buffers during the shutdown:
- Axe the SHOW_BUSYBUFS option and use a tunable to selectively
enable/disable it, which defaults to not printing anything (0
value) but can be changed to print (1 value) or be verbose (2
value)
- Improve the information output: right now, there is no track of
the actual struct buf object or vnode which are referenced by the
shutdown process; instead, the related struct bufobj object is printed,
which is not really helpful
- Add more verbosity about the state of the struct buf lock and the
vnode information, with the latter to be activated separately by the
sysctl

Sponsored by: Sandvine Incorporated
Reviewed by: emaste, kib
Approved by: re (ksmith)
MFC after: 10 days


# 7a0b13ed 25-Jul-2011 Andriy Gapon <avg@FreeBSD.org>

remove RESTARTABLE_PANICS option

This is done per request/suggestion from John Baldwin
who introduced the option. Trying to resume normal
system operation after a panic is very unpredictable
and dangerous. It will become even more dangerous
when we allow a thread in panic(9) to penetrate all
lock contexts.
I understand that the only purpose of this option was
for testing scenarios potentially resulting in panic.

Suggested by: jhb
Reviewed by: attilio, jhb
X-MFC-After: never
Approved by: re (kib)


# e3adb685 08-Jun-2011 Attilio Rao <attilio@FreeBSD.org>

In the current code, a double panic condition may lead to dumps
interleaving.
Signal dumping to happen only for the first panic which should be the
most important.

Sponsored by: Sandvine Incorporated
Submitted by: Nima Misaghian (nmisaghian AT sandvine DOT com)
MFC after: 2 weeks
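
A minimal sketch of the "first panic wins" idea, assuming a single flag guards the dump path. The names `sketch_dumping` and `sketch_should_dump()` are hypothetical, and the sketch ignores the SMP ordering concerns a kernel implementation has to deal with.

```c
#include <assert.h>
#include <stdbool.h>

static bool sketch_dumping;		/* set once the first dump has started */
static int sketch_dumps_started;	/* bookkeeping for the sketch only */

/* Returns true if this caller should perform the dump. */
static bool
sketch_should_dump(void)
{
	if (sketch_dumping)
		return (false);	/* a nested panic must not dump again */
	sketch_dumping = true;
	sketch_dumps_started++;
	/* Invariant: at most one dump ever starts. */
	assert(sketch_dumps_started == 1);
	return (true);
}
```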


# 299cceef 06-Jun-2011 Marcel Moolenaar <marcel@FreeBSD.org>

Fix making kernel dumps from the debugger by creating a command
for it. Do not expect a developer to call doadump(). Calling
doadump does not necessarily work when it's declared static. Nor
does it necessarily do what was intended in the context of text
dumps. The dump command always creates a core dump.

Move printing of error messages from doadump to the dump command,
now that we don't have to worry about being called from DDB.


# 2be767e0 28-Apr-2011 Attilio Rao <attilio@FreeBSD.org>

Add watchdog patting during the (shutdown time) disk syncing and
disk dumping.
With the option SW_WATCHDOG on, these operations are doomed to let the
watchdog fire if they take too long.

I implemented the stubs this way because I really want the wdog_kern_*
KPI to not be dependent on SW_WATCHDOG being on (and really, the option
only enables watchdog activation in hardclock) and also to avoid
calling them when not necessary (avoiding involuntary watchdog
activations).

Sponsored by: Sandvine Incorporated
Discussed with: emaste, des
MFC after: 2 weeks


# fd104c15 24-Oct-2010 Rebecca Cran <brucec@FreeBSD.org>

Mostly revert r203420, and add similar functionality into ada(4) since the
existing code caused problems with some SCSI controllers.

A new sysctl kern.cam.ada.spindown_shutdown has been added that controls
whether or not to spin-down disks when shutting down.
Spinning down the disks unloads/parks the heads - this is
much better than removing power when the disk is still
spinning because otherwise an Emergency Unload occurs which may cause damage
to the actuator.

PR: kern/140752
Submitted by: olli
Reviewed by: arundel
Discussed with: mav
MFC after: 2 weeks


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 76e18b25 17-Oct-2010 Marcel Moolenaar <marcel@FreeBSD.org>

Rename boot() to kern_reboot() and make it visible outside of
kern_shutdown.c. This makes it easier for emulators and other
parts of the kernel to initiate a reboot.


# 64dd590e 09-Oct-2010 Andriy Gapon <avg@FreeBSD.org>

panic_cpu variable should be volatile

This is to prevent caching of its value in a register when it is checked
and modified by multiple CPUs in parallel.
Also, move the variable into the scope of the only function that uses it.

Reviewed by: jhb
Hint from: mdf
MFC after: 1 week


# 08a9c205 01-Oct-2010 Andriy Gapon <avg@FreeBSD.org>

sysctls in kern_shutdown: add twin tunables

also make couple of sysctl-controlled variables static

Reviewed by: rwatson
MFC after: 1 week


# 7b7fb491 13-May-2010 Attilio Rao <attilio@FreeBSD.org>

MFC r206878, r206897, r207921:
Fix a deadlock in the shutdown code when some CPUs are performing
smp_rendezvous() (or smp_tlb_shootdown()) and are waiting for
acknowledgment.


# 0a2d5fea 19-Apr-2010 Attilio Rao <attilio@FreeBSD.org>

Fix compilation in the !SMP case.
Keep the interrupts disabled in order to avoid preemption problems.

Reported by: tinderbox, b.f. <bf1783 at googlemail dot com>
MFC: 2 weeks
X-MFC: r206878


# 248bb937 19-Apr-2010 Attilio Rao <attilio@FreeBSD.org>

Fix a deadlock in the shutdown code:
When performing a smp_rendezvous() or more likely, on amd64 and i386,
a smp_tlb_shootdown() the caller will end up with the smp_ipi_mtx
spinlock held, busy-waiting for other CPUs to acknowledge the operation.
As long as CPUs are suspended (via cpu_reset()) between the active mask
read and IPI sending there can be a deadlock where the caller will wait
forever for a dead CPU to acknowledge the operation.
Please note that on CPU0 this is going to be somewhat heavier because
the spinlocks are disabled earlier than quitting the machine.

Fix this bug by calling cpu_reset() with the smp_ipi_mtx held.
Note that a saner offline/online CPUs mechanism would very likely
help heavily in fixing similar cases, as more bugs of this type
may arise in the future.

Reported by: rwatson
Discussed with: jhb
Tested by: rnoland, Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
MFC: 2 weeks

Special dedication to: anyone who made it possible to have 16-way machines
in Netperf


# 6f15a274 03-Feb-2010 Alexander Motin <mav@FreeBSD.org>

MFp4:
Make CAM stop all attached devices on system shutdown.
It allows devices to park heads, reducing stress on power loss.
Add `kern.cam.power_down` tunable and sysctl to control it.


# 46b24831 08-Jan-2010 Warner Losh <imp@FreeBSD.org>

Revert r199758. It pointed out that we were calling pcpu_init way too
late...


# 0ff75e4f 24-Nov-2009 Warner Losh <imp@FreeBSD.org>

Only call critical_enter()/critical_exit() if curthread has been set.
Otherwise we dereference a null pointer and can't get useful panic
info early in boot.


# 4f9d48e4 23-Oct-2009 John Baldwin <jhb@FreeBSD.org>

Don't bother copying the name of a kproc or kthread out into a temporary
array just to pass that array to printf(). kproc and kthread names are
NUL-terminated and can be printed using printf() directly.

Reviewed by: bde


# b22692bd 10-Sep-2009 Nick Hibma <n_hibma@FreeBSD.org>

Add a comment on the consequences of reducing the poweroff delay


# be105717 13-Aug-2009 Attilio Rao <attilio@FreeBSD.org>

MFC r196196:

* Completely remove the option STOP_NMI from the kernel. This option
has proven to have a good effect when entering KDB by using an NMI,
but it completely violates all the good rules about interrupts
disabled while holding a spinlock in other occasions. This can be the
cause of deadlocks on events where a normal IPI_STOP is expected.
* Add a new IPI called IPI_STOP_HARD on all the supported architectures.
This IPI is responsible for sending a stop message among CPUs using a
privileged channel when available. In other cases it just matches a
normal IPI_STOP.
Right now the IPI_STOP_HARD functionality uses an NMI on ia32 and amd64
architectures, while on the others it has a normal IPI_STOP effect. It is
the responsibility of maintainers to eventually implement a hard stop
when necessary and possible.
* Use the new IPI facility in order to implement a new front-end SMP kernel
function called stop_cpus_hard(). It mirrors stop_cpus() but
uses the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and added comments

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by: jhb
Tested by: pho, bz, rink
Approved by: re (kib)


# dc6fbf65 13-Aug-2009 Attilio Rao <attilio@FreeBSD.org>

* Completely remove the option STOP_NMI from the kernel. This option
has proven to have a good effect when entering KDB by using an NMI,
but it completely violates all the good rules about interrupts
disabled while holding a spinlock in other occasions. This can be the
cause of deadlocks on events where a normal IPI_STOP is expected.
* Add a new IPI called IPI_STOP_HARD on all the supported architectures.
This IPI is responsible for sending a stop message among CPUs using a
privileged channel when available. In other cases it just matches a
normal IPI_STOP.
Right now the IPI_STOP_HARD functionality uses an NMI on ia32 and amd64
architectures, while on the others it has a normal IPI_STOP effect. It is
the responsibility of maintainers to eventually implement a hard stop
when necessary and possible.
* Use the new IPI facility in order to implement a new front-end SMP kernel
function called stop_cpus_hard(). It mirrors stop_cpus() but
uses the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and added comments

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by: jhb
Tested by: pho, bz, rink
Approved by: re (kib)


# c1f19219 13-Jun-2009 Jamie Gritton <jamie@FreeBSD.org>

Rename the host-related prison fields to be the same as the host.*
parameters they represent, and the variables they replaced, instead of
abbreviated versions of them.

Approved by: bz (mentor)


# bcf11e8d 05-Jun-2009 Robert Watson <rwatson@FreeBSD.org>

Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with: pjd


# 76ca6f88 29-May-2009 Jamie Gritton <jamie@FreeBSD.org>

Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex. Jails may
have their own host information, or they may inherit it from the
parent/system. The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL. The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by: bz (mentor)
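
The lookup rule described above (use the credential's prison, or fall back to the system prison when NULL is passed) can be sketched in user-space C. Everything prefixed `sketch_` is a hypothetical, heavily reduced stand-in; in particular the sketch does no locking, which a kernel version would have to consider.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAXHOSTNAMELEN 256

/* Reduced stand-ins for struct prison and struct ucred. */
struct sketch_prison {
	char pr_host[SKETCH_MAXHOSTNAMELEN];
};
struct sketch_ucred {
	struct sketch_prison *cr_prison;
};

/* Stand-in for prison0, which holds the system hostname. */
static struct sketch_prison sketch_prison0 = { "host.example.org" };

/*
 * Copy the hostname visible to 'cred' into 'buf'; a NULL cred selects
 * the system-wide name. The result is always NUL-terminated and is
 * truncated if 'buf' is too small.
 */
static void
sketch_getcredhostname(const struct sketch_ucred *cred, char *buf,
    size_t size)
{
	const struct sketch_prison *pr;

	assert(buf != NULL && size > 0);
	pr = (cred != NULL) ? cred->cr_prison : &sketch_prison0;
	strncpy(buf, pr->pr_host, size - 1);
	buf[size - 1] = '\0';
}
```

Copying into a caller-supplied buffer, rather than handing out a pointer, is what lets callers avoid touching the prison structure directly.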


# 27457a80 03-Apr-2009 Marcel Moolenaar <marcel@FreeBSD.org>

PowerPC, meet kernel core dumps. The support is based
on a generic dumper that creates an ELF core file and
uses PMAP functions to scan and iterate over memory
chunks, as well as handle memory mappings used during
dumping.
The PMAP layer can choose to return physical memory
chunks or virtual memory chunks. For minidumps, the
chunks should be virtual.

The default MMU I/F implementation for the scan_md()
method returns NULL. Thus, when a PMAP implementation
does not implement the required methods, an empty
core file is created. Here, empty means having an ELF
header only.

Obtained from: Juniper Networks


# 27d68f90 23-Nov-2008 David Malone <dwmalone@FreeBSD.org>

It's possible that the dump device has gone away after it was
configured, change the message to let people know this is a
possibility. I've slightly changed the message from the one
submitted by Pekka to keep the printf on one line.

Submitted by: Pekka Savola <pekkas@netcore.fi>


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# e6592ee5 01-Oct-2008 Peter Wemm <peter@FreeBSD.org>

Collect N identical (or near identical) mkdumpheader() implementations into
one, as threatened in the comment. Textdump magic can be passed in.


# 41a4e90e 27-Sep-2008 Konstantin Belousov <kib@FreeBSD.org>

If the panic thread is preempted after setting panicstr but before
setting TDF_INPANIC then it will never be rescheduled again. Wrap
setting the panic condition with the critical section.

Noted and reviewed by: tegge
MFC after: 1 week


# 237fdd78 16-Mar-2008 Robert Watson <rwatson@FreeBSD.org>

In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation. This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after: 1 month
Discussed with: imp, rink


# 9e473363 04-Mar-2008 Ruslan Ermilov <ru@FreeBSD.org>

Make it possible to continue working after calling doadump()
manually from debugger. (This got broken in rev. 1.122.)


# 007b1b7b 28-Jan-2008 Ruslan Ermilov <ru@FreeBSD.org>

Add a wrapper function that bound checks writes to the dump device.


# d638e093 19-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

- Introduce the function lockmgr_recursed() which returns true if the
lockmgr lkp, when held in exclusive mode, is recursed
- Introduce the function BUF_RECURSED() which does the same for bufobj
locks, built on top of lockmgr_recursed()
- Introduce the function BUF_ISLOCKED() which works like its counterpart
VOP_ISLOCKED(9), showing the state of the lockmgr linked with the bufobj

BUF_RECURSED() and BUF_ISLOCKED() entirely replace the usage of the bogus
BUF_REFCNT() in a more explicit and SMP-compliant way.
This allows us to axe BUF_REFCNT() and leaves the function
lockcount() totally unused in our stock kernel. Further commits will
axe lockcount() as well, as part of the lockmgr() cleanup.

The KPI is, obviously, broken as a result, so further commits will
update the manpages and the FreeBSD version.

Tested by: kris (on UFS and NFS)


# 618c7db3 26-Dec-2007 Robert Watson <rwatson@FreeBSD.org>

Add textdump(4) facility, which provides an alternative form of kernel
dump using mechanically generated/extracted debugging output rather than
a simple memory dump. Current sources of debugging output are:

- DDB output capture buffer, if there is captured output to save
- Kernel message buffer
- Kernel configuration, if included in kernel
- Kernel version string
- Panic message

Textdumps are stored in swap/dump partitions as with regular dumps, but
are laid out as ustar files in order to allow multiple parts to be stored
as a stream of sequentially written blocks. Blocks are written out in
reverse order, as the size of a textdump isn't known a priori. As with
regular dumps, they will be extracted using savecore(8).

One new DDB(4) command is added, "textdump", which accepts "set",
"unset", and "status" arguments. By default, normal kernel dumps are
generated unless "textdump set" is run in order to schedule a textdump.
It can be canceled using "textdump unset" to restore generation of a
normal kernel dump.

Several sysctls exist to configure aspects of textdumps;
debug.ddb.textdump.pending can be read to check whether a textdump is
pending, or set/unset in order to control whether the next kernel dump
will be a textdump from userspace.

While textdumps don't have to be generated as a result of a DDB script
run automatically as part of a kernel panic, this is a particularly useful
way to use them, as instead of generating a complete memory dump, a
simple transcript of an automated DDB session can be captured using the
DDB output capture and textdump facilities. This can be used to
generate quite brief kernel bug reports rich in debugging information
but not dependent on kernel symbol tables or precisely synchronized
source code. Most textdumps I generate are less than 100k including
the full message buffer. Using textdumps with an interactive debugging
session is also useful, with capture being enabled/disabled in order to
record some but not all of the DDB session.

MFC after: 3 months
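
Writing blocks in reverse order when the total size is not known in advance can be sketched as below. This is only an illustration against an in-memory buffer: `sketch_dumpdev` and its helpers are hypothetical, and the ustar headers and dump headers of a real textdump are not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCKSIZE 512

/* In-memory stand-in for a dump partition. */
struct sketch_dumpdev {
	unsigned char *media;
	size_t mediasize;	/* multiple of SKETCH_BLOCKSIZE */
	size_t next;		/* offset just past the next block to write */
};

static void
sketch_dump_start(struct sketch_dumpdev *dd, unsigned char *media,
    size_t size)
{
	assert(size % SKETCH_BLOCKSIZE == 0);
	dd->media = media;
	dd->mediasize = size;
	dd->next = size;	/* start at the end of the device */
}

/* Write one block immediately before the previously written one. */
static int
sketch_dump_writeblock(struct sketch_dumpdev *dd, const void *block)
{
	if (dd->next < SKETCH_BLOCKSIZE)
		return (ENOSPC);	/* ran off the front of the device */
	dd->next -= SKETCH_BLOCKSIZE;
	memcpy(dd->media + dd->next, block, SKETCH_BLOCKSIZE);
	return (0);
}
```

When writing stops, `next` marks where the data begins, so the writer never needs to know the final length up front.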


# 3de213cc 25-Dec-2007 Robert Watson <rwatson@FreeBSD.org>

Add a new 'why' argument to kdb_enter(), and a set of constants to use
for that argument. This will allow DDB to detect the broad category of
reason why the debugger has been entered, which it can use for the
purposes of deciding which DDB script to run.

Assign approximate why values to all current consumers of the
kdb_enter() interface.


# 7ab24ea3 26-Oct-2007 Julian Elischer <julian@FreeBSD.org>

Introduce a way to make pure kernel threads.
kthread_add() takes the same parameters as the old kthread_create()
plus a pointer to a process structure, and adds a kernel thread
to that process.

kproc_kthread_add() takes the parameters for kthread_add,
plus a process name and a pointer to a pointer to a process instead of just
a pointer, and if the proc * is NULL, it creates the process to the
specifications required, before adding the thread to it.

All other old kthread_xxx() calls return, but act on (struct thread *)
instead of (struct proc *). One reason to change the name is so that
any old kernel modules that are lying around and expect kthread_create()
to make a process will not just accidentally link.

fix top to show kernel threads by their thread name in -SH mode
add a tdnam formatting option to ps to show thread names.

make all idle threads actual kthreads and put them into their own idled process.
make all interrupt threads kthreads and put them in an interd process
(mainly for aesthetic and accounting reasons)
rename proc 0 to be 'kernel' and its swapper thread is now 'swapper'

man page fixes to follow.


# 30d239bc 24-Oct-2007 Robert Watson <rwatson@FreeBSD.org>

Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

mac_<object>_<method/action>
mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updated as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer


# 3745c395 20-Oct-2007 Julian Elischer <julian@FreeBSD.org>

Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
This makes way for us to add REAL kthread_create() and friends
that actually make threads. It turns out that most of these
calls actually end up being moved back to the thread version
when it's added, but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.


# 982d11f8 04-Jun-2007 Jeff Roberson <jeff@FreeBSD.org>

Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
synchronization.
- Use the per-process spinlock rather than the sched_lock for per-process
scheduling synchronization.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


# 0c14ff0e 04-Mar-2007 Robert Watson <rwatson@FreeBSD.org>

Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.


# acd3428b 06-Nov-2006 Robert Watson <rwatson@FreeBSD.org>

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# aed55708 22-Oct-2006 Robert Watson <rwatson@FreeBSD.org>

Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from: TrustedBSD Project
Sponsored by: SPARTA


# 0909f38a 10-Apr-2006 Pawel Jakub Dawidek <pjd@FreeBSD.org>

On shutdown try to turn off all swap devices. This way GEOM providers are
properly closed on shutdown.

Requested by: ru
Reviewed by: alc
MFC after: 2 weeks


# 36a52c3c 06-Feb-2006 Jeff Roberson <jeff@FreeBSD.org>

- Add the global 'rebooting' variable that is used to detect when
boot() has been called.

Sponsored by: Isilon Systems, Inc.
MFC After: 1 week


# 3fafa27b 22-Sep-2005 Stephan Uphoff <ups@FreeBSD.org>

Don't pretend to be thread0 when calling sync().
It confuses the lock manager since in some places thread0 is
then used for vnode locking while curthread is used for vnode unlocking.

Found by: Yahoo!
Reviewed by: ps@,jhb@
MFC after: 3 days


# d07f87a2 08-Sep-2005 Don Lewis <truckman@FreeBSD.org>

Add a new struct buf flag bit, B_PERSISTENT, and use it to tag
struct bufs that are persistently held by ext2fs. Ignore any buffers
with this flag in the code in boot() that counts "busy" and dirty
buffers and attempts to sync the dirty buffers, which is done before
attempting to unmount all the file systems during shutdown.

This fixes the problem caused by any ext2fs file systems that are
mounted at system shutdown time, which caused boot() to give up on
a non-zero number of buffers and skip the call to vfs_unmountall().
This left all the mounted file systems in a dirty state and caused
them to all require cleanup by fsck on reboot.

Move the two separate copies of the "busy" buffer test in boot()
to a separate function.

Nuke the useless spl() stuff in the ext2fs ULCK_BUF() macro.

Bring the PRINT_BUF_FLAGS definition in sys/buf.h up to date with
this and previous flag changes.

PR: kern/56675, kern/85163
Tested by: "Matthias Andree" matthias.andree at gmx.de
Reviewed by: bde
MFC after: 3 days
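
A rough sketch of a single "is this buffer busy" helper that skips persistently held buffers. The flag values, the `sketch_buf` layout and the exact busy criteria are assumptions made for illustration and do not mirror the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values are placeholders for this sketch, not the kernel's. */
#define SKETCH_B_DELWRI		0x1	/* dirty, write pending */
#define SKETCH_B_PERSISTENT	0x2	/* held persistently by the file system */

struct sketch_buf {
	unsigned int b_flags;
	int b_locked;		/* non-zero while someone holds the buffer */
};

/* Decide whether one buffer should hold up the unmount at shutdown. */
static int
sketch_isbufbusy(const struct sketch_buf *bp)
{
	if ((bp->b_flags & SKETCH_B_PERSISTENT) != 0)
		return (0);	/* never counted, whatever its other state */
	return (bp->b_locked != 0 || (bp->b_flags & SKETCH_B_DELWRI) != 0);
}

/* Count the buffers that would prevent a clean unmount. */
static size_t
sketch_count_busy(const struct sketch_buf *bufs, size_t n)
{
	size_t i, busy = 0;

	assert(bufs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (sketch_isbufbusy(&bufs[i]))
			busy++;
	return (busy);
}
```

Keeping the test in one helper means the counting pass and the syncing pass cannot drift apart, which is the point of merging the two copies.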


# 0b581232 11-Apr-2005 Jeff Roberson <jeff@FreeBSD.org>

- Remove unused include.


# 2fd32b93 29-Nov-2004 Nate Lawson <njl@FreeBSD.org>

Replace a printf with a KASSERT that we are indeed running on the BSP.


# f7ebc7ce 07-Nov-2004 Marcel Moolenaar <marcel@FreeBSD.org>

Bind to cpu0 for boot() processing on all platforms again.


# 70ce93f4 06-Nov-2004 Nate Lawson <njl@FreeBSD.org>

Add comments to clarify why we need to run shutdown code on the BSP, update
an old comment about boot() being MI, and note that splhigh() no longer
disables interrupts.


# 0de3e728 05-Nov-2004 Peter Wemm <peter@FreeBSD.org>

Restrict the sched_bind to cpu 0 to i386 and amd64 for now. I forgot that
alpha still doesn't use logical cpu id's.


# 20e25d7d 05-Nov-2004 Peter Wemm <peter@FreeBSD.org>

Bind to cpu0 for boot() processing. (Note this is reboot, not startup)
This means we'll always call the event hooks, device_shutdown etc on the
BSP and theoretically means we can de-cruftify the cpu_reset_proxy stuff.


# c5690651 04-Nov-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove buf->b_dev field.


# 37abb77f 04-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Change the perfectly precise message
printf("No buffers busy after final sync");
to
printf("All buffers synced.");
in order to not leave the users wondering if there should be.


# 9923b511 02-Sep-2004 Scott Long <scottl@FreeBSD.org>

Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c). This
is a possible MT5 candidate.


# 0eac4495 29-Aug-2004 Dag-Erling Smørgrav <des@FreeBSD.org>

Remove the HW_WDOG option; it serves no purpose.

MFC after: 3 days


# 55c45354 20-Aug-2004 John Baldwin <jhb@FreeBSD.org>

Remove some dead code under a straggling APIC_IO #ifdef that I missed
back before 5.2.


# b6915bdb 15-Aug-2004 Don Lewis <truckman@FreeBSD.org>

Yet another tweak to the shutdown messages in boot():

Don't count busy buffers before the initial call to sync() and
don't skip the initial sync() if no busy buffers were found.
Always call sync() at least once if syncing is requested. This
defers the "Syncing disks, buffers remaining..." message until
after the initial sync() call and the first count of busy
buffers. This backs out changes in kern_shutdown 1.162.

Print a different message when there are no busy buffers after the
initial sync(), which is now the expected situation.

Print an additional message when syncing has completed successfully
in the unusual situation where the work of syncing was done by
boot().

Uppercase one message to make it consistent with all of the other
kernel shutdown messages.

Discussed with: bde (in a much earlier form, prior to 1.162)
Reviewed by: njl (in an earlier form)


# c8c216d5 09-Aug-2004 Nate Lawson <njl@FreeBSD.org>

Skip the syncing disks loop if there are no dirty buffers. Remove a
variable used to flag the initial printf.

Submitted by: truckman (earlier version)


# b1c81391 29-Jul-2004 Nate Lawson <njl@FreeBSD.org>

Minor message cleanup.


# 46e38ce8 21-Jul-2004 Robert Watson <rwatson@FreeBSD.org>

Don't sync the file system on panic by default. This seems to basically
work very infrequently, and often results in a compound panic which
confuses debugging; locking/SMP have made the layering violation (and
risks) of this more obvious over time.

Discussed with: green, bde, et al.


# 3a63b92c 19-Jul-2004 Julian Elischer <julian@FreeBSD.org>

You always spot the typos after you have committed.. Start sentence
with a Cap.


# f6449d9d 19-Jul-2004 Julian Elischer <julian@FreeBSD.org>

Allow the user who calls doadump() from the kernel debugger
to not get a page fault if he has not defined a dump device.
Panic can often not do a dump as it can hang forever in some cases.
The original PR was for amd64 only. This is a generalised version of
that change.

PR: amd64/67712
Submitted by: wjw@withagen.nl <Willen Jan Withagen>


# bb5faea3 15-Jul-2004 Alfred Perlstein <alfred@FreeBSD.org>

Cleanup shutdown output.


# da6303ba 14-Jul-2004 Alfred Perlstein <alfred@FreeBSD.org>

Tidy up system shutdown.


# 8916adb1 14-Jul-2004 Nate Lawson <njl@FreeBSD.org>

Clean up the output on reboot by keeping completion messages on the same
line as the announcement. Someone should probably update the "buffers
remaining" message since we now no longer should have any buffers remaining
at that point.


# 2d50560a 10-Jul-2004 Marcel Moolenaar <marcel@FreeBSD.org>

Update for the KDB framework:
o Make debugging code conditional upon KDB instead of DDB.
o Call kdb_enter() instead of Debugger().
o Call kdb_backtrace() instead of db_print_backtrace() or backtrace().

kern_mutex.c:
o Replace checks for db_active with checks for kdb_active and make
them unconditional.

kern_shutdown.c:
o s/DDB_UNATTENDED/KDB_UNATTENDED/g
o s/DDB_TRACE/KDB_TRACE/g
o Save the TID of the thread doing the kernel dump so the debugger
knows which thread to select as the current when debugging the
kernel core file.
o Clear kdb_active instead of db_active and do so unconditionally.
o Remove backtrace() implementation.

kern_synch.c:
o Call kdb_reenter() instead of db_error().


# 0c0b25ae 02-Jul-2004 John Baldwin <jhb@FreeBSD.org>

Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
determine if a thread about to be added to a run queue should be
preempted to directly. If it is not safe to preempt or if the new
thread does not have a high enough priority, then the function returns
false and sched_add() adds the thread to the run queue. If the thread
should be preempted to but the current thread is in a nested critical
section, then the flag TDF_OWEPREEMPT is set and the thread is added
to the run queue. Otherwise, mi_switch() is called immediately and the
thread is never added to the run queue since it is switched to directly.
When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
setrunqueue() now does all the correct work. This also removes the
do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
chance to run if the architecture supports native preemption since
the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by: scottl (with his re@ hat)
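
The decision made by maybe_preempt(), as described in the first bullet, can be sketched as a pure function. The types, the placeholder flag value, the "lower value is more urgent" priority convention and the use of a simple nesting counter are assumptions of this sketch, not the scheduler's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_TDF_OWEPREEMPT 0x1	/* placeholder value */

struct sketch_thread {
	int td_priority;	/* lower value = more urgent (assumed) */
	int td_critnest;	/* critical section nesting depth */
	unsigned int td_flags;
};

enum sketch_action {
	SKETCH_ENQUEUE,			/* put the new thread on a run queue */
	SKETCH_ENQUEUE_DEFERRED,	/* enqueue now, preempt on section exit */
	SKETCH_SWITCH_NOW		/* switch directly, bypass the run queue */
};

static enum sketch_action
sketch_maybe_preempt(struct sketch_thread *cur,
    const struct sketch_thread *ntd)
{
	if (ntd->td_priority >= cur->td_priority)
		return (SKETCH_ENQUEUE);	/* not urgent enough */
	if (cur->td_critnest > 0) {
		/* Cannot switch inside a critical section: remember it. */
		cur->td_flags |= SKETCH_TDF_OWEPREEMPT;
		return (SKETCH_ENQUEUE_DEFERRED);
	}
	return (SKETCH_SWITCH_NOW);
}

/*
 * Leave one level of critical section. Returns true when the outermost
 * level was left and a deferred preemption is now due.
 */
static bool
sketch_critical_exit(struct sketch_thread *cur)
{
	assert(cur->td_critnest > 0);
	if (--cur->td_critnest == 0 &&
	    (cur->td_flags & SKETCH_TDF_OWEPREEMPT) != 0) {
		cur->td_flags &= ~SKETCH_TDF_OWEPREEMPT;
		return (true);
	}
	return (false);
}
```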


# bf0acc27 02-Jul-2004 John Baldwin <jhb@FreeBSD.org>

- Change mi_switch() and sched_switch() to accept an optional thread to
switch to. If a non-NULL thread pointer is passed in, then the CPU will
switch to that thread directly rather than calling choosethread() to pick
a thread to switch to.
- Make sched_switch() aware of idle threads and know to do
TD_SET_CAN_RUN() instead of sticking them on the run queue rather than
requiring all callers of mi_switch() to know to do this if they can be
called from an idlethread.
- Move constants for arguments to mi_switch() and thread_single() out of
the middle of the function prototypes and up above into their own
section.


# f3732fd1 17-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Second half of the dev_t cleanup.

The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.


# 9a6dc4b6 06-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove filename+line number from panic messages.


# 7f8a436f 05-Apr-2004 Warner Losh <imp@FreeBSD.org>

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# 29bcc451 24-Jan-2004 Jeff Roberson <jeff@FreeBSD.org>

- Add a flags parameter to mi_switch. The value of flags may be SW_VOL or
SW_INVOL. Assert that one of these is set in mi_switch() and properly
adjust the rusage statistics. This is to simplify the large number of
users of this interface which were previously all required to adjust the
proper counter prior to calling mi_switch(). This also facilitates more
switch and locking optimizations.
- Change all callers of mi_switch() to pass the appropriate parameter and
remove direct references to the process statistics.
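
A sketch of moving the statistics update into the switch routine itself. The flag values and the `sketch_rusage` structure are placeholders; only the counter names follow the familiar rusage fields.

```c
#include <assert.h>

#define SKETCH_SW_VOL	0x1	/* placeholder values */
#define SKETCH_SW_INVOL	0x2

struct sketch_rusage {
	long ru_nvcsw;		/* voluntary context switches */
	long ru_nivcsw;		/* involuntary context switches */
};

static void
sketch_mi_switch(struct sketch_rusage *ru, int flags)
{
	/* The caller must say which kind of switch this is. */
	assert(flags == SKETCH_SW_VOL || flags == SKETCH_SW_INVOL);
	if ((flags & SKETCH_SW_VOL) != 0)
		ru->ru_nvcsw++;
	else
		ru->ru_nivcsw++;
	/* ... the actual context switch would happen here ... */
}
```

With the accounting done in one place, callers only pass a flag and can no longer forget to bump, or bump the wrong, counter.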


# 50d23be1 19-Jan-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Add linenumber and source filename to panic(9) output.

Ideally a traceback should be printed too, any takers ?


# 26502503 16-Aug-2003 Marcel Moolenaar <marcel@FreeBSD.org>

Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI
prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to
cpu.h. This affects db_command.c and kern_shutdown.c.

ia64: move all MD prototypes from cpu.h to md_var.h. This affects
madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory().
It's not used (vm_machdep.c).

alpha: the MD prototypes have been left in cpu.h with a comment
that they should be there. Moving them is left for later. It was
expected that the impact would be significant enough to be done in
a separate commit.

powerpc: MD prototypes left in cpu.h. Comment added.

Suggested by: bde
Tested with: make universe (pc98 incomplete)


# 4f1b4577 15-Jun-2003 Ian Dowse <iedowse@FreeBSD.org>

Don't overwrite the static panicstr buffer for secondary and further
panics. Before revision 1.38, we used to just point panicstr at the
format string if panicstr was NULL, but since we now use a static
buffer for the formatted panic message, we have to be careful to
only write to it during the first panic.

Pointed out by: bde
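
The "first panic only" rule can be sketched in userland C as follows. The names `panic_buf`, `panicstr_sketch` and `panic_sketch` are illustrative assumptions, as is the buffer size; the real panic() does considerably more.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

char panic_buf[256];            /* static buffer for the formatted message */
const char *panicstr_sketch;    /* NULL until the first panic */

/*
 * Format the panic message into the static buffer, but only for the
 * first panic. Later panics leave the original message intact.
 */
void
panic_sketch(const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);
	va_start(ap, fmt);
	if (panicstr_sketch == NULL) {
		vsnprintf(panic_buf, sizeof(panic_buf), fmt, ap);
		panicstr_sketch = panic_buf;
	}
	va_end(ap);
}
```

A second call such as panic_sketch("second %d", 2) leaves the buffer holding the first message, so whatever is inspected afterwards still describes the original failure.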


# 677b542e 10-Jun-2003 David E. O'Brien <obrien@FreeBSD.org>

Use __FBSDID().


# f385f715 17-Apr-2003 John Baldwin <jhb@FreeBSD.org>

Lock the sched_lock while setting TDF_INPANIC.


# a3007012 16-Apr-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Don't include <sys/disklabel.h>


# e95499bd 13-Feb-2003 Alfred Perlstein <alfred@FreeBSD.org>

style.


# 891e0668 12-Feb-2003 Peter Wemm <peter@FreeBSD.org>

Print "Stack backtrace:" right before dumping the backtrace. We cannot
expect end users to automatically recognize a stack trace for what it is.


# 05e393f0 09-Feb-2003 Jeff Roberson <jeff@FreeBSD.org>

- Update a printf format for b_flags.


# 3c3871e5 04-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce the
void backtrace(void);
function which will print a backtrace if DDB is in the kernel and an
explanation if not.

This is useful for recording backtraces in non-fatal circumstances and
does not require pollution with DDB #includes in the files where it
is used.

It would of course be nice to have a non-DDB dependent version too,
but since the meat of a backtrace is MD it is probably not worth it.


# ec63e12a 17-Nov-2002 Alfred Perlstein <alfred@FreeBSD.org>

During shutdown explain what the numbers following the 'syncing
disks' message mean, specifically, 'buffers remaining...'.


# a2ecb9b7 27-Oct-2002 Robert Watson <rwatson@FreeBSD.org>

Hook up mac_check_system_reboot(), a MAC Framework entry point that
permits MAC modules to augment system security decisions regarding
the reboot() system call, if MAC is compiled into the kernel.

Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# e381d245 20-Oct-2002 Thomas Moestl <tmm@FreeBSD.org>

Add kernel dump support, based on the ia64 version (which was committed
as sparc64/sparc64/dump_machdep.c a while back).
Other than ia64 (which uses ELF), sparc64 uses a homegrown format for
the dumps (headers are required because the physical address and size of
the tsb must be noted, and because physical memory may be discontiguous);
ELF would not offer any advantages here.

Reviewed by: jake


# e485b64b 19-Sep-2002 John Baldwin <jhb@FreeBSD.org>

Add ability to dump stacktraces on kernel panics when DDB is compiled into
the kernel. By default this is turned off since otherwise it could scroll
valuable panic messages off of the screen. This option can be turned on
by the DDB_TRACE kernel option as well as the debug.trace_on_panic sysctl.

Also, fix the DDB_UNATTENDED option to use its own header instead of
abusing opt_ddb.h. This way turning that one option on or off doesn't
force you to recompile all of ddb.

Requested by: many (1), bde (2*)

* - I know bde prefers !abusing option headers in general but can't
remember if he has brought up this specific case.


# 0711ca46 01-Aug-2002 John Baldwin <jhb@FreeBSD.org>

Revert previous revision which was accidentally committed and has not been
tested yet.


# fbd140c7 01-Aug-2002 John Baldwin <jhb@FreeBSD.org>

If we fail to write to a vnode during a ktrace write, then we drop all
other references to that vnode as a trace vnode in other processes as well
as in any pending requests on the todo list. Thus, it is possible for a
ktrace request structure to have a NULL ktr_vp when it is destroyed in
ktr_freerequest(). We shouldn't call vrele() on the vnode in that case.

Reported by: bde


# fe799533 16-Jul-2002 Andrew Gallatin <gallatin@FreeBSD.org>

Allow alphas to do crashdumps: Refuse to run anything in choosethread()
after a panic which is not an interrupt thread, or the thread which
caused the panic. Also, remove panicstr checks from msleep() and from
cv_wait() in order to allow threads to go to sleep and yield the cpu
to the panicking thread, or to an interrupt thread which might
be doing the crashdump.

Reviewed by: jhb (and it was mostly his idea too)


# eb80408c 11-Jul-2002 John Baldwin <jhb@FreeBSD.org>

Add a missing newline during panic printf's for SMP systems that don't
have APICS. (Like all the !i386 archs).


# e602ba25 29-Jun-2002 Julian Elischer <julian@FreeBSD.org>

Part 1 of KSE-III

The ability to schedule multiple threads per process
(on one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by: Almost everyone who counts
(at various times, peter, jhb, matt, alfred, mini, bernd,
and a cast of thousands)

NOTE: this is still Beta code, and contains lots of debugging stuff.
expect slight instability in signals..


# 882c6b1e 12-May-2002 Marcel Moolenaar <marcel@FreeBSD.org>

Fix alpha build. The alpha has dumpsys implemented.
While here, revert the condition to list the machines
for which dumpsys has not been implemented.

Reported by: wilko


# d39e457b 08-Apr-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Put back dumppcb, but this time we put a comment to tell what it is for.

Brucifixion by: bde


# d7ef6277 05-Apr-2002 Yoshihiro Takahashi <nyan@FreeBSD.org>

Added the new kernel dumping support for pc98.


# 79024518 02-Apr-2002 Marcel Moolenaar <marcel@FreeBSD.org>

Don't compile the dummy dumpsys for ia64.


# 44731cab 01-Apr-2002 John Baldwin <jhb@FreeBSD.org>

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# c23cda85 01-Apr-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Extend a hack to also hack around PC98's definition of __i386__


# 81661c94 31-Mar-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Here follows the new kernel dumping infrastructure.

Caveats:

The new savecore program is not complete in the sense that it emulates
enough of the old savecore's features to do the job, but implements none
of the options yet.

I would appreciate if a userland hacker could help me out getting savecore
to do what we want it to do from a user's point of view, compression,
email-notification, space reservation etc etc. (send me email if
you are interested).

Currently, savecore will scan all devices marked as "swap" or "dump" in
/etc/fstab _or_ any devices specified on the command-line.

All architectures but i386 lack an implementation of dumpsys(), but
looking at the i386 version it should be trivial for anybody familiar
with the platform(s) to provide this function.

Documentation is quite sparse at this time, more to come.

Details:

ATA and SCSI drivers should work as the dump formatting code has been
removed. The IDA, TWE and AAC have not yet been converted.

Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set
the device as dumpdev. To implement the "off" argument, /dev/null
is used as the device.

Savecore will fail if handed any options since they are not (yet)
implemented. All devices marked "dump" or "swap" in /etc/fstab
will be scanned and dumps found will be saved to diskfiles
named from the MD5 hash of the header record. The header record
is dumped in readable format in the .info file. The kernel
is not saved. Only complete dumps will be saved.

All maintainer rights for this code are disclaimed: feel free to
improve and extend.

Sponsored by: DARPA, NAI Labs


# 8d19a265 31-Mar-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Centralize the "bootdev" and "dumpdev" variables. They are still pretty
bogus all things considered, but at least now they don't camouflage as
being MD variables.


# 752dff3d 06-Mar-2002 Jake Burkholder <jake@FreeBSD.org>

Add needed includes of machine/smp.h, remove nested include in sys/smp.h
so that inlines in machine/smp.h can use variables declared in sys/smp.h.


# 237a8a02 08-Feb-2002 Julian Elischer <julian@FreeBSD.org>

Replace accidentally removed setrunqueue()
solves problem with machines failing to sync in booting.
Submitted by: Tor.Egge@cvsup.no.freebsd.org


# 079b7bad 07-Feb-2002 Julian Elischer <julian@FreeBSD.org>

Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embedded there
but remove all the code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,


# dcd7d9b7 20-Jan-2002 Maxim Sobolev <sobomax@FreeBSD.org>

Allow dump device be configured as early as possible using loader(8) tunable.
This allows obtaining crash dumps from the panics occurred during late stages
of kernel initialisation before system enters into single-user mode.

MFC after: 2 weeks


# 422702e9 18-Jan-2002 Nik Clayton <nik@FreeBSD.org>

Explain that the admin can safely power down the system as well as
rebooting.


# c86b6ff5 05-Jan-2002 John Baldwin <jhb@FreeBSD.org>

Change the preemption code for software interrupt thread schedules and
mutex releases to not require flags for the cases when preemption is
not allowed:

The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent
switching to a higher priority thread on mutex release and swi schedule,
respectively when that switch is not safe. Now that the critical section
API maintains a per-thread nesting count, the kernel can easily check
whether or not it should switch without relying on flags from the
programmer. This fixes a few bugs in that all current callers of
swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from
fast interrupt handlers and the swi_sched of softclock needed this flag.
Note that to ensure that swi_sched()'s in clock and fast interrupt
handlers do not switch, these handlers have to be explicitly wrapped
in critical_enter/exit pairs. Presently, just wrapping the handlers is
sufficient, but in the future with the fully preemptive kernel, the
interrupt must be EOI'd before critical_exit() is called. (critical_exit()
can switch due to a deferred preemption in a fully preemptive kernel.)

I've tested the changes to the interrupt code on i386 and alpha. I have
not tested ia64, but the interrupt code is almost identical to the alpha
code, so I expect it will work fine. PowerPC and ARM do not yet have
interrupt code in the tree so they shouldn't be broken. Sparc64 is
broken, but that's been ok'd by jake and tmm who will be fixing the
interrupt code for sparc64 shortly.

Reviewed by: peter
Tested on: i386, alpha


# 817805d9 12-Nov-2001 Paul Saab <ps@FreeBSD.org>

Fix a signed bug in the crashdump code for systems with > 2GB of ram.

Reviewed by: peter
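
The commit does not say which expression was affected, so the following is only a generic illustration of this class of bug: a page count multiplied by the page size stops fitting in a 32-bit signed int at 2GB. The helper names and the 4096-byte page size are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096   /* assumed page size for illustration */

/* Byte size of `pages` pages, computed in 64 bits so it cannot wrap. */
int64_t
dump_bytes_sketch(int pages)
{
	assert(pages >= 0);
	return (int64_t)pages * PAGE_SIZE_SKETCH;
}

/* Nonzero when the byte count would not fit in a 32-bit signed int. */
int
overflows_int32_sketch(int pages)
{
	return dump_bytes_sketch(pages) > INT32_MAX;
}
```

For example, 524288 pages of 4096 bytes is exactly 2GB, one byte past INT32_MAX, so any int-typed size or offset goes wrong from that point on.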


# 259ed917 19-Oct-2001 Peter Wemm <peter@FreeBSD.org>

Add a sysctl for preventing the sync() in panic() recovery. This can
be so dangerous it isn't funny. eg: if you panic inside NFS or softdep,
and then try and sync you run into held locks and cause either deadlocks,
recursive panics or other interesting chaos. Default is unchanged.


# fbd7a9dd 20-Sep-2001 Peter Wemm <peter@FreeBSD.org>

decrement the dumping variable after use so we can call it several times
if needed.


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doozy!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 04b5a9bb 10-Sep-2001 John Baldwin <jhb@FreeBSD.org>

- Axe holding_giant as it is not used now anyways and was ok'd by
dillon in an earlier e-mail.
- We don't need to test the console right before we vfprintf() the panicstr
message. The printing of the panic message is a fine console test by
itself and doesn't make useful messages scroll off the screen or tick
developers off in quite the same way.

Requested by: jlemon, imp, bmilekic, chris, gsutter, jake (2)


# fc8b64e4 05-Sep-2001 Peter Wemm <peter@FreeBSD.org>

Sigh. Dig up text from a signature in a 1994 Usenet post I made and redo
the ..uhh... ``console test'' to avoid another 50 emails about GPL issues.


# 772121fd 01-Sep-2001 Peter Wemm <peter@FreeBSD.org>

The !RESTARTABLE_PANICS code has some loose ends.


# 835a82ee 01-Sep-2001 Matthew Dillon <dillon@FreeBSD.org>

Giant Pushdown. Saved the worst P4 tree breakage for last.

reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit()
setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp()
getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid()
setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid()
setresuid() setresgid() getresuid() getresgid () __setugid() getlogin()
setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload()
kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize()
dup2() dup() fcntl() close() ofstat() fstat() nfsstat() fpathconf()
flock()


# 1432aa0c 23-Aug-2001 John Baldwin <jhb@FreeBSD.org>

Add a new kernel option RESTARTABLE_PANICS. If this option is present,
then one can restart from a panic by resetting the panicstr variable to
NULL. This commit conditionalizes the previously committed functionality
on this variable. It also removes the __dead2 attribute from the panic()
function so that when one continues from a panic() the behavior will
be predictable.


# 61e96500 21-Aug-2001 John Baldwin <jhb@FreeBSD.org>

Clear db_active in boot() so that one can call the boot function (as well
as use the panic command) w/o having to manually clear db_active first
to avoid the db_error() in mi_switch().


# 1a5333c3 21-Aug-2001 John Baldwin <jhb@FreeBSD.org>

Allow one to restart from a panic in DDB by clearing the panicstr
variable to NULL. Note that since panic() is marked with __dead2, this
has somewhat unpredictable results at best.


# a572c95c 15-Aug-2001 Bruce Evans <bde@FreeBSD.org>

Don't dump on the label sector or below. This avoids clobbering the
label if the dump device overlaps the label (which is a slight
misconfiguration). Dump routines don't use dscheck(), so the normal
write protection of the label doesn't help.

Reduced some nearby overflow bugs. In disk_dumpcheck(), there was
(fatal but fail-safe) overflow on i386's with 4GB of memory, at least
if Maxmem was the top page (can this happen?). The fix assumes that
the sector size divides PAGE_SIZE (dump routines already assume this).
In setdumpdev(), the corresponding overflow occurred with only about
2GB of memory on all machines with 32-bit ints. This allowed setdumpdev()
to succeed when it shouldn't have, but then disk_dumpcheck() failed
safe later. Except in old versions of FreeBSD like RELENG_3 where
there is no disk_dumpcheck().

PR: 28164 (label clobbering part)
MFC after: 1 week


# 1d79f1bb 25-Jun-2001 John Baldwin <jhb@FreeBSD.org>

- Sort includes.
- Count the context switches during shutdown when we give ithreads a chance
to run as voluntary context switches.

Submitted by: bde (2)


# 60fb0ce3 28-Apr-2001 Greg Lehey <grog@FreeBSD.org>

Revert consequences of changes to mount.h, part 2.

Requested by: bde


# 6caa8a15 27-Apr-2001 John Baldwin <jhb@FreeBSD.org>

Overhaul of the SMP code. Several portions of the SMP kernel support have
been made machine independent and various other adjustments have been made
to support Alpha SMP.

- It splits the per-process portions of hardclock() and statclock() off
into hardclock_process() and statclock_process() respectively. hardclock()
and statclock() call the *_process() functions for the current process so
that UP systems will run as before. For SMP systems, it is simply necessary
to ensure that all other processors execute the *_process() functions when the
main clock functions are triggered on one CPU by an interrupt. For the alpha
4100, clock interrupts are delivered in a staggered broadcast fashion, so
we simply call hardclock/statclock on the boot CPU and call the *_process()
functions on the secondaries. For x86, we call statclock and hardclock as
usual and then call forward_hardclock/statclock in the MD code to send an IPI
to cause the AP's to execute forwarded_hardclock/statclock which then call the
*_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
involve less hackery. Now the cpu doing the forward sets any flags, etc. and
sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically
return so that they can execute ast() and don't bother with setting the
astpending or needresched flags themselves. This also removes the loop in
forward_signal() as sched_lock closes the race condition that the loop worked
around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
a process to act on rather than assuming curproc so that they can be used to
implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
header sys/smp.h declares MI SMP variables and API's. The IPI API's from
machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
SLIST of globaldata structures has become MI and moved into subr_smp.c.
Also, the globaldata list is only available if SMP support is compiled in.

Reviewed by: jake, peter
Looked over by: eivind


# d98dc34f 23-Apr-2001 Greg Lehey <grog@FreeBSD.org>

Correct #includes to work with fixed sys/mount.h.


# abd9053e 16-Apr-2001 John Baldwin <jhb@FreeBSD.org>

Blow away the panic mutex in favor of using a single atomic_cmpset() on a
panic_cpu shared variable. I used a simple atomic operation here instead
of a spin lock as it seemed to be excessive overhead. Also, this can avoid
recursive panics if, for example, witness is broken.
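
A minimal userland sketch of the idea, using C11 atomics in place of the kernel's atomic_cmpset(). The names `panic_cpu_sketch`, `NOCPU_SKETCH` and `panic_claim_sketch` are assumptions, and letting the owning CPU back in is a guess at how a recursive panic on that CPU would be handled, not something the commit message states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NOCPU_SKETCH (-1)       /* assumed "no CPU owns the panic" marker */

atomic_int panic_cpu_sketch = NOCPU_SKETCH;

/*
 * Try to become the CPU that handles the panic. A single
 * compare-and-swap replaces taking a spin mutex, so no lock code
 * (and no lock checking code) runs on this path.
 */
bool
panic_claim_sketch(int cpuid)
{
	int expected = NOCPU_SKETCH;

	assert(cpuid >= 0);
	if (atomic_compare_exchange_strong(&panic_cpu_sketch, &expected, cpuid))
		return true;            /* this CPU won the race */
	/* On failure, expected holds the current owner. */
	return expected == cpuid;       /* same CPU panicking again */
}
```

A CPU that loses the race gets false and would be expected to wait rather than proceed.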


# 6b8b8c7f 27-Mar-2001 Paul Saab <ps@FreeBSD.org>

Last commit was broken.. It always prints '[CTRL-C to abort]'.
Move duplicate code for printing the status of the dump and checking
for abort into a separate function.

Pointy hat to: me


# 87729a2b 06-Mar-2001 John Baldwin <jhb@FreeBSD.org>

Lock initproc when we send SIGINT to init during shutdown.


# d888fc4e 11-Feb-2001 Mark Murray <markm@FreeBSD.org>

RIP <machine/lock.h>.

Some things needed bits of <i386/include/lock.h> - cy.c now has its
own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK()
has been moved to <i386/include/apic.h> (AKA <machine/apic.h>).
Reviewed by: jhb


# 9ed346ba 08-Feb-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarly, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 1a6e52d0 06-Feb-2001 Jeroen Ruigrok van der Werven <asmodai@FreeBSD.org>

Fix typo: seperate -> separate.

Seperate does not exist in the english language.


# 1b367556 23-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Convert all simplelocks to mutexes and remove the simplelock implementations.


# ef73ae4b 09-Jan-2001 Jake Burkholder <jake@FreeBSD.org>

Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables
other than curproc.


# ffc831da 15-Dec-2000 John Baldwin <jhb@FreeBSD.org>

Stick the kthread API in a kthread_* namespace, and the specialized kproc
functions in a kproc_* namespace.

Reviewed by: -arch


# 2bcc63c5 28-Nov-2000 John Baldwin <jhb@FreeBSD.org>

Only print out APIC info on an SMP system during a panic if APIC_IO is
defined.


# 20cdcc5b 15-Nov-2000 John Baldwin <jhb@FreeBSD.org>

Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch(). The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by: witness


# 35e0e5b3 20-Oct-2000 John Baldwin <jhb@FreeBSD.org>

Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h


# db7e3af1 15-Oct-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unneeded #include <machine/clock.h>


# ac5f943c 13-Oct-2000 Peter Wemm <peter@FreeBSD.org>

savectx() is now used exclusively by the crash dump system. Move the
i386 specific gunk (copy %cr3 to the pcb) from the MI dumpsys() to the
MD savectx().


# 16a011f9 13-Oct-2000 Paul Saab <ps@FreeBSD.org>

Do not allocate a callout during any crashdump, not just when you panic.


# 621dbe43 16-Sep-2000 Bruce Evans <bde@FreeBSD.org>

Added used include of <sys/mutex.h> (don't depend on pollution in
<sys/signalvar.h>).


# 4a6404df 11-Sep-2000 John Baldwin <jhb@FreeBSD.org>

Fix some printf format string warnings due to sizeof(int) != sizeof(long) on
the alpha.


# 62820f25 10-Sep-2000 Jason Evans <jasone@FreeBSD.org>

Allow interrupt threads to run during shutdown. This should fix the
"dirty buffers during shutdown" problem introduced by the SMPng commit.

Submitted by: tegge, cg


# 0384fff8 06-Sep-2000 Jason Evans <jasone@FreeBSD.org>

Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).

Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh


# 82acbcf5 03-Sep-2000 Peter Wemm <peter@FreeBSD.org>

kern_shutdown.c was more ANSI-C than K&R - remove the remnants of K&R
support with extreme prejudice.


# 87de3703 03-Sep-2000 Peter Wemm <peter@FreeBSD.org>

gcc knows that savectx() is potentially a setjmp style dual-return
function which may lead to stack lossage and clobbered variables.
This isn't the case here, but there is no way to tell gcc that.

Work around this in a kinda bizarre way, but it shuts gcc up.


# 3e755f76 30-Aug-2000 Mike Smith <msmith@FreeBSD.org>

Make it possible to pass boot()'s flags to shutdown_nice() so that the
kernel can instigate an orderly shutdown but still determine the form of
that shutdown. Make it possible eg. to cleanly shutdown and power off the
system under ACPI when the power button is pressed.


# 77978ab8 04-Jul-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.

Pointed out by: bde


# 82d9ae4e 03-Jul-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:

Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

-sysctl_vm_zone SYSCTL_HANDLER_ARGS
+sysctl_vm_zone (SYSCTL_HANDLER_ARGS)


# 9626b608 05-May-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bde's teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.h> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by: peter


# db6a4261 28-Mar-2000 Matthew Dillon <dillon@FreeBSD.org>

The SMP cleanup commit broke UP compiles. Make UP compiles work again.


# 49503b44 12-Jan-2000 Luoqi Chen <luoqi@FreeBSD.org>

Seconds to ticks conversion was done at the wrong place.


# 5e950839 07-Jan-2000 Luoqi Chen <luoqi@FreeBSD.org>

Introduce a mechanism to suspend/resume system processes. Suspend syncer
and bufdaemon prior to disk sync during system shutdown.


# 9eec6969 06-Dec-1999 Mike Smith <msmith@FreeBSD.org>

Change the default poweroff delay from 0 to 5 seconds. This seems to be
adequate for the IDE disks that I have available for testing. Most seem
to wait between 1 and 3 seconds before flushing their caches.

Add the ability to override the delay at compile time via the
undocumented option POWEROFF_DELAY. The delay can still be set via
sysctl as it was originally implemented.
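
The compile-time override described above follows the usual "default unless already defined" pattern. In this sketch the unit is assumed to be seconds, matching the commit text, and `poweroff_delay_sketch` and the ticks helper are illustrative names only.

```c
#include <assert.h>

/* Default used unless the build supplies its own POWEROFF_DELAY. */
#ifndef POWEROFF_DELAY
#define POWEROFF_DELAY 5        /* seconds; assumed unit for this sketch */
#endif

/* Run-time copy of the delay; in the kernel this is what a sysctl would adjust. */
int poweroff_delay_sketch = POWEROFF_DELAY;

/* Convert the configured delay to clock ticks for a given tick rate. */
int
poweroff_delay_ticks_sketch(int hz)
{
	assert(hz > 0);
	return poweroff_delay_sketch * hz;
}
```

Building with -DPOWEROFF_DELAY=10 would change the default without touching the source, while the variable remains adjustable at run time.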


# 72dfe7a3 06-Dec-1999 Poul-Henning Kamp <phk@FreeBSD.org>

I always forget to check before I reboot a system, and while it
boots I try in vain to remember which month or even year this system
was last booted in.

Print out the uptime before rebooting, and give people like me
less (or more as it may be) to think about while the system boots.


# ee072c08 28-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Convert dumpon to work on character devices instead of block devices.

NB: You may need to change your /etc/rc.conf!


# 0429e37a 20-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

struct mountlist and struct mount.mnt_list have no business being
a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively.

This removes ugly mp != (void*)&mountlist comparisons.

Requested by: phk
Submitted by: Jake Burkholder jake@checker.org
PR: 14967


# 9c111b31 08-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

A little bit of nitpicking in the 'syncing disks...' end of a shutdown.


# d1f088da 11-Oct-1999 Peter Wemm <peter@FreeBSD.org>

Trim unused options (or #ifdef for undoc options).

Submitted by: phk


# c5b07219 29-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unneeded "maj" variable.

Give up if we have already started dumping once before.

Print name of dumpdev.


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# d9183205 23-Aug-1999 Bruce Evans <bde@FreeBSD.org>

Use devtoname() to print dev_t's instead of casting them to long or u_long
for misprinting in %lx format.


# fcb893a8 21-Aug-1999 Mike Smith <msmith@FreeBSD.org>

Implement a new generic mechanism for attaching handler functions to
events, in order to pave the way for removing a number of the ad-hoc
implementations currently in use.

Retire the at_shutdown family of functions and replace them with
new event handler lists.

Rework kern_shutdown.c to take greater advantage of the use of event
handlers.

Reviewed by: green


# 7dc5cd04 13-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

The bdevsw() and cdevsw() are now identical, so kill the former.


# 59d5fe5a 11-Aug-1999 Alfred Perlstein <alfred@FreeBSD.org>

When doing a dump, if ENODEV is returned explain what happened to the user,
"the device doesn't support a dump routine"

Only print "dump succeeded" when 0 is returned, instead of when an unexpected
error number is returned, print that error number.

Reviewed by: Eivind


# ce9edcf5 09-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Merge the cons.c and cons.h to the best of my ability. alpha may or
may not compile, I can't test it.


# fb30b5bd 20-Jul-1999 Brian Feldman <green@FreeBSD.org>

Make a dev2budev() function, and use it. This refixes pstat (working, broken,
working, broken, working) and savecore (working, working, broken, working,
working).

Sorta Reviewed by: phk


# 240a86a4 20-Jul-1999 Brian Feldman <green@FreeBSD.org>

dev2udev() returns a CDEV udev_t, but we use block io in savecore. Savecore
also gets the device by st_rdev, which is alright except for the fact that
the sysctl kern.dumpdev passed out a char device. This is a workaround.
Sorry for not committing the fix earlier, before people started having
problems.


# f06a54f0 17-Jul-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Centralize dumpdev handling.


# b9dffbec 01-Jul-1999 Peter Wemm <peter@FreeBSD.org>

Fix a warning - the code is correct but gcc can't tell.


# 67812eac 25-Jun-1999 Kirk McKusick <mckusick@FreeBSD.org>

Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.


# ccb84588 12-May-1999 Peter Wemm <peter@FreeBSD.org>

Try and fix a couple of dev_t/major/minor etc nits.


# 4be2eb8c 08-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

I got tired of seeing all the cdevsw[major(foo)] all over the place.

Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.


# 46eede00 07-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Continue where Julian left off in July 1998:

Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline)
function.

Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
to the order of the cmaj/bmaj arguments!)

Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
(ditto!)

(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)


# dfd5dee1 06-May-1999 Peter Wemm <peter@FreeBSD.org>

Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.


# 3d177f46 03-May-1999 Bill Fumerola <billf@FreeBSD.org>

Add sysctl descriptions to many SYSCTL_XXXs

PR: kern/11197
Submitted by: Adrian Chadd <adrian@FreeBSD.org>
Reviewed by: billf(spelling/style/minor nits)
Looked at by: bde(style)


# f711d546 27-Apr-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


# 5c40906f 30-Jan-1999 Mike Smith <msmith@FreeBSD.org>

An error in the last commit; the changes were submitted by, not reviewed by,
"D. Rock" <rock@cs.uni-sb.de>


# db82a982 30-Jan-1999 Mike Smith <msmith@FreeBSD.org>

Add a new sysctl node kern.shutdown, off which shutdown-related things
can be hung.

Add a tunable delay at the beginning of the SHUTDOWN_FINAL at_shutdown
queue, allowing time to settle before we launch into the list of things
that are expected to turn the system off.

Fix a bug in at_shutdown_pri() where the second insertion always put
the item in second position in the queue.

Reviewed by: "D. Rock" <rock@cs.uni-sb.de>
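
The at_shutdown_pri() fix above concerns keeping a handler queue in priority
order on every insertion, not just the first. A minimal sketch of that kind of
ordered insert follows; struct sd_entry and sd_insert() are hypothetical names
for illustration and are not the kernel's actual structures or code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue entry; not the kernel's at_shutdown structure. */
struct sd_entry {
	int		 priority;	/* lower value runs earlier */
	struct sd_entry	*next;
};

/*
 * Insert ep into the list rooted at *headp, keeping ascending priority
 * order.  Entries of equal priority keep their insertion order.
 */
static void
sd_insert(struct sd_entry **headp, struct sd_entry *ep)
{
	struct sd_entry **pp;

	assert(headp != NULL && ep != NULL);
	/* Walk past every entry that must run no later than ep. */
	for (pp = headp; *pp != NULL && (*pp)->priority <= ep->priority;
	    pp = &(*pp)->next)
		;
	ep->next = *pp;
	*pp = ep;
}
```

Walking a pointer-to-pointer means the head, middle and tail cases share one
code path, so the position of a second or third insertion depends only on its
priority.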


# 9959b1a8 28-Dec-1998 Mike Smith <msmith@FreeBSD.org>

Improved DDB_UNATTENDED behaviour. From the submitter:

There's something that's been bugging me for a while, so I decided to fix it.
FreeBSD now will DTRT WRT DDB and DDB_UNATTENDED (!debugger_on_panic), at least
in my opinion. The behavior change is such that:

1. Nothing changes when debugger_on_panic != 0.
2. When DDB_UNATTENDED (!debugger_on_panic), if a panic occurs, the
machine will reboot. Also, if a trap occurs, the machine will
panic and reboot, unlike how it broke to DDB before. HOWEVER,
a trap inside DDB will not cause a panic, allowing full use
of DDB without having to worry about the machine being stuck
at a DDB prompt if something goes wrong during the day.
Patches for this behavior follow my signature, and it would
be a boon to anyone (like me) who uses DDB_UNATTENDED, but
actually wants the machine to panic on a trap (otherwise,
what's the use, if the machine causes a fatal trap rather than
a true panic, of debugger_on_panic?). The changes cause no
adverse behavior, but do involve two symbols becoming global

Submitted by: Brian Feldman <green@unixhelp.org>


# 2127f260 04-Dec-1998 Archie Cobbs <archie@FreeBSD.org>

Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for other cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Mike Spengler <mks@networkcs.com>
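
The pattern described above (snprintf() bounded by the destination size rather
than a bare constant) can be sketched as follows. format_devname() is a
hypothetical helper written for this illustration; it is not taken from the
commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format name and unit into dst without writing more than dstlen bytes.
 * snprintf() always NUL-terminates when dstlen > 0.
 * Returns 0 on success, -1 if the output had to be truncated.
 */
static int
format_devname(char *dst, size_t dstlen, const char *name, int unit)
{
	int n;

	assert(dst != NULL && dstlen > 0 && name != NULL);
	n = snprintf(dst, dstlen, "%s%d", name, unit);
	if (n < 0 || (size_t)n >= dstlen)
		return (-1);
	return (0);
}
```

A caller would pass sizeof(buf) for dstlen, so the bound follows the buffer if
its declared size ever changes.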


# d02d6d04 13-Nov-1998 Mike Smith <msmith@FreeBSD.org>

Don't count non-local dirty buffers as outstanding when shutting down.
This avoids the fsck-on-reboot symptoms if you're shutting down with a
hung or unreachable NFS server mounted. Also remove non-local
filesystems from the mount list to prevent the system hanging when it tries
to unmount them (for the same reason).

Drew points out that there's a good argument for forcibly removing all
"non syncable" filesystems from the mount list (eg. NFS mounts, disks
that aren't responding, etc.) as this then allows you to sync and
cleanly unmount their parents. No such change is included in this
patch.

Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>


# 35d27a0f 29-Oct-1998 Mike Smith <msmith@FreeBSD.org>

Add the ability to specify where on the at_shutdown queue a handler is
installed.

Remove cpu_power_down, and replace it with an entry at the end of the
SHUTDOWN_FINAL queue in the only place it's used (APM).

Submitted by: Some ideas from Bruce Walter <walter@fortean.com>


# a511bf18 20-Sep-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Fix precedence bug, so that kernel dump works.


# 2cfa0a03 15-Sep-1998 Justin T. Gibbs <gibbs@FreeBSD.org>

Add a new at_shutdown queue, SHUTDOWN_FINAL. This queue is run at
splhigh() after any system dumps have completed. SHUTDOWN_POST_SYNC
isn't quite late enough for disk controllers.

Converted at_shutdown queues to use the queue(3) macros.


# 99237364 06-Sep-1998 Andrey A. Chernov <ache@FreeBSD.org>

Store formatted panic string in static buffer to make it available later
for savecore.
Previous code gave only the panic format to savecore.


# 70d154a6 23-Aug-1998 Dag-Erling Smørgrav <des@FreeBSD.org>

Don't check minor number of dump device at all.

Discussed-with: Jörg Wunsch


# 9103e864 19-Aug-1998 Dag-Erling Smørgrav <des@FreeBSD.org>

Include opt_devfs.h which defines SLICE, to make previous commit
meaningful.

Pointed out by: Luoqi Chen


# d08b9c13 16-Aug-1998 Dag-Erling Smørgrav <des@FreeBSD.org>

Enable kernel dumps on SLICE systems.


# ac1e407b 11-Jul-1998 Bruce Evans <bde@FreeBSD.org>

Fixed printf format errors.


# ecbb00a2 07-Jun-1998 Doug Rabson <dfr@FreeBSD.org>

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
in line with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# 2f1e7069 17-May-1998 Tor Egge <tegge@FreeBSD.org>

Add forwarding of roundrobin to other cpus. This gives a more regular
update of cpu usage as shown by top when one process is cpu bound
(no system calls) while the system is otherwise idle (except for top).

Don't attempt to switch to the BSP in boot(). If the system was idle when
an interrupt caused a panic, this won't work. Instead, switch to the BSP
in cpu_reset.

Remove some spurious forward_statclock/forward_hardclock warnings.


# b322fb5d 12-May-1998 Bruce Evans <bde@FreeBSD.org>

Backed out previous commit. It is invalid to call d_ioctl() on
possibly non-open devices, and we don't want to restrict dumping
to swap devices anyway. It is especially invalid to call d_ioctl()
in non-process context for panics. d_psize() can be called on
non-open devices, at least on non-SLICED ones that support d_dump(),
and setdumpdev() has depended on this for a long time although it
is probably wrong, but even d_psize() can't be called in non-process
context - that's why dumpsys() depends on previously computed values
although these values may be stale. The historical restriction to
devices with dkpart(dev) == SWAP_PART should go away.


# 7f2f1b78 06-May-1998 Julian Elischer <julian@FreeBSD.org>

Add dump support to the DEVFS/slice code.
now we can actually catch our crashes :-)

Submitted by: Luoqi Chen <luoqi@chen.ml.org> (the man who's everywhere)


# b1897c19 08-Mar-1998 Julian Elischer <julian@FreeBSD.org>

Reviewed by: dyson@freebsd.org (John Dyson), dg@root.com (David Greenman)
Submitted by: Kirk McKusick (mcKusick@mckusick.com)
Obtained from: Whistle development tree


# d94f38ac 16-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Add HW_WDOG to LINT, and turn it into a new-style option.


# 95802bf8 25-Nov-1997 Julian Elischer <julian@FreeBSD.org>

Shift a few SYSINIT() calls around.
This results in a few functions becoming static, and
the SYSINITs being close to the code they are related to.
Setting up the dump device is with dumpsys() and
kicking off the scheduler is with the scheduler.
Mounting root is with the code that does it.

Reviewed by: phk


# fc8f7066 18-Nov-1997 Bruce Evans <bde@FreeBSD.org>

Get buffer stuff by #including <sys/buf.h> instead of <sys/vnode.h>.

Staticized boot().

Fixed a gratuitous ANSIism.


# cb226aaa 06-Nov-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warnings, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
to be recompiled.


# 279a6932 05-Sep-1997 Peter Wemm <peter@FreeBSD.org>

Cosmetic adjustment for the trap/double fault/panic cpu id listing.
It now prints the apic id in hex rather than decimal.


# e4ba6a82 02-Sep-1997 Bruce Evans <bde@FreeBSD.org>

Removed unused #includes.


# 6d58e6cb 31-Aug-1997 Bruce Evans <bde@FreeBSD.org>

Fixed options SHOW_BUSYBUFS and PANIC_REBOOT_WAIT_TIME which were broken
by incomplete cutting and pasting from machdep.c to kern_shutdown.c.

PR: 3953


# 90bcb528 26-Aug-1997 Peter Wemm <peter@FreeBSD.org>

Correct some things I forgot about until it was too late with smp_active.
smp_active = 1 used to indicate that the system had frozen previously
started AP's, while smp_active = 0 was "AP's not yet started". I have split
this into smp_started (which is set when the AP's come online), and
smp_active is left for turning on/off AP scheduling.


# 9a629c93 25-Aug-1997 Bruce Evans <bde@FreeBSD.org>

Fixed some formatting and style bugs.

Fixed a gratuitous ANSIism.


# 63fe995c 08-Aug-1997 Julian Elischer <julian@FreeBSD.org>

Teach both disk drivers how to cope with a hardware watchdog
while dumping core. I'm tired of getting 1/2 of a core dump.

Conditional on -DHW_WDOG for now;
this will migrate to 2.2 as that's where I need it.


# 5230cfd2 08-Aug-1997 Julian Elischer <julian@FreeBSD.org>

Use up 4 precious bytes to give the kernel a hook to
support hardware watchdogs. The actual functions would be supplied in an LKM
or a linked file, but they need to hang off something.


# b3196e4b 22-Jun-1997 Peter Wemm <peter@FreeBSD.org>

Preliminary support for per-cpu data pages.

This eliminates a lot of #ifdef SMP type code. Things like _curproc reside
in a data page that is unique on each cpu, eliminating the expensive macros
like: #define curproc (SMPcurproc[cpunumber()])

There are some unresolved bootstrap and address space sharing issues at
present, but Steve is waiting on this for other work. There is still some
strictly temporary code present that isn't exactly pretty.

This is part of a larger change that has run into some bumps, this part is
standalone so it should be safe. The temporary code goes away when the
full idle cpu support is finished.

Reviewed by: fsmp, dyson


# 3f777345 14-Jun-1997 Garrett Wollman <wollman@FreeBSD.org>

When APM is configured, turn off the power when halting for good.


# 47d81897 24-May-1997 Steve Passe <fsmp@FreeBSD.org>

Move the printing of "cpu#%d" to AFTER the general panic argument string.
When a panic occurs early in the SMP boot process 'cpunumber()' hangs,
causing the panic string to be lost. Now the system appears to hang
in 'breakpoint()', but at least the user sees the panic string before the
hang.


# 477a642c 26-Apr-1997 Peter Wemm <peter@FreeBSD.org>

Man the liferafts! Here comes the long awaited SMP -> -current merge!

There are various options documented in i386/conf/LINT, there is more to
come over the next few days.

The kernel should run pretty much "as before" without the options to
activate SMP mode.

There are a handful of known "loose ends" that need to be fixed, but
have been put off since the SMP kernel is in a moderately good condition
at the moment.

This commit is the result of the tinkering and testing over the last 14
months by many people. A special thanks to Steve Passe for implementing
the APIC code!


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# ac0ad63f 16-Jan-1997 Bruce Evans <bde@FreeBSD.org>

Reduced #include spam in <sys/sysproto.h> and fixed things that depended
on it.

makesyscalls.sh:
This parsed $Id$. Fixed(?) to parse $FreeBSD$. The output is wrong when
the id is not expanded in the source file.

syscalls.master:
Fixed declaration of sigsuspend(). There are still some bogons and
spam involving sigset_t.
Use `struct foo *' instead of the equivalent `foo_t *' for some nfs and
lfs syscalls so that <sys/sysproto.h> doesn't depend on <sys/mount.h>.


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# d13d3630 30-Oct-1996 Julian Elischer <julian@FreeBSD.org>

Further improved version of handling a HALT when there is no console.


# 75680b05 30-Oct-1996 Julian Elischer <julian@FreeBSD.org>

If there is no console, cngetc should act like getc and return -1.

Make callers aware of this in those cases where it can occur.


# a7f8f2ab 13-Sep-1996 Bruce Evans <bde@FreeBSD.org>

Changed cncheckc() interface so that it is 8-bit clean - return -1
instead of 0 if there is no input.
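
The reason -1 is 8-bit clean where 0 is not: a polling read that returns an
int can hand back every byte value 0..255 and still have a distinct "nothing
pending" result. The sketch below illustrates only that convention; struct
poll_src and poll_getc() are hypothetical stand-ins, not the console code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory input source standing in for a console device. */
struct poll_src {
	const unsigned char	*buf;	/* pending input bytes */
	size_t			 len;	/* number of bytes in buf */
	size_t			 pos;	/* index of next unread byte */
};

/*
 * Non-blocking read: return the next byte as 0..255, or -1 if no input
 * is pending.  A NUL byte (0) is therefore valid input and cannot be
 * mistaken for "no input".
 */
static int
poll_getc(struct poll_src *src)
{
	assert(src != NULL);
	if (src->pos >= src->len)
		return (-1);
	return (src->buf[src->pos++]);
}
```

Callers compare the result against -1 before narrowing it to a character
type, in the same way as with getc() and EOF.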


# fc0b1dbf 13-Sep-1996 Bruce Evans <bde@FreeBSD.org>

Don't use __dead in the kernel. It was an obfuscation for gcc >= 2.5
and a no-op for gcc >= 2.6.


# 91916800 07-Sep-1996 Søren Schmidt <sos@FreeBSD.org>

Fixed two small leftovers from PHK's mega devconf removal commit.


# bfbb029d 06-Sep-1996 Poul-Henning Kamp <phk@FreeBSD.org>

Remove devconf, it never grew up to be of any use.


# 18cb99e9 26-Aug-1996 Julian Elischer <julian@FreeBSD.org>

Remove the old cleanup code as it is no longer used.
Also fix two cases of = instead of ==
(cut+paste bug duplication).


# e0d898b4 21-Aug-1996 Julian Elischer <julian@FreeBSD.org>

Some cleanups to the callout lists recently added.
Note that at_shutdown has a new parameter to indicate when
during a shutdown the callout should be made. Also
add a RB_POWEROFF flag to the reboot "howto" parameter;
it tells the reboot code in our at_shutdown module to turn off the UPS
and kill the power. Bound to be useful eventually on laptops.


# 269fb9d7 19-Aug-1996 Julian Elischer <julian@FreeBSD.org>

Collect all the functions concerned with rebooting into one place.
Also add the at_shutdown callout list, and change the one user of
the present (broken) method (the vn driver) to use the new scheme.


# ad4240fe 18-Aug-1996 Julian Elischer <julian@FreeBSD.org>

Move all functions related to shutting down to one file
called kern_shutdown.c

Note: I couldn't see anything machine dependent in the
functions boot() and dumpsys() which were in machdep.c.
I have left a prototype for cpu_boot() which would go in
machdep.c, but I have nothing to put in it. I expect others will
let me know in no uncertain ways that this or that is machine dependent
and should be there, but I'll wait for that to happen.. :)

I haven't actually taken the functions OUT of machdep
or anywhere else yet. I'm checking in this file so others can have a look
at it and comment. SO PLEASE DO COMMENT!

I am also (in another checkin) adding a man(9) page for the new
at_shotdown().. er, Freudian slip there.. at_shutdown() call,
so have a look at that (and at_exit and at_fork as well)
and feed me comments.

I'll check in the changes to make these (shutdown) changes active tomorrow
if no one objects too strongly.