History log of /freebsd-current/sys/sys/kernel.h
Revision Date Author Comments
# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 71679cf4 05-Sep-2023 Colin Percival <cperciva@FreeBSD.org>

init_main: Switch from SLIST to STAILQ, fix order

Constructing an SLIST of SYSINITs by inserting them one by one at the
head of the list resulted in them being sorted in anti-stable order:
When two SYSINITs tied for (subsystem, order), they were executed in
the reverse order to the order in which they appeared in the linker
set.

Note that while this changes struct sysinit, it doesn't affect ABI
since SLIST_ENTRY and STAILQ_ENTRY are compatible (in both cases a
single pointer to the next element).

Fixes: 9a7add6d01f3 "init_main: Switch from sysinit array to SLIST"
Reported by: gallatin
Reviewed by: jhb, gallatin, emaste
Tested by: gallatin
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41748


# cedc82c0 17-Jul-2023 Colin Percival <cperciva@FreeBSD.org>

struct sysinit: Add SLIST_ENTRY(sysinit) next

This will be used to put SYSINITs onto a linked list.

Reviewed by: jhb, emaste
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D41074


# 2ff63af9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .h pattern

Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/


# ae062ff2 12-Nov-2021 Andrew Turner <andrew@FreeBSD.org>

Move KHELP_DECLARE_MOD_UMA later in the boot

Both KHELP_DECLARE_MOD_UMA and the kernel linker SYSINIT to find
in-kernel modules run at SI_SUB_KLD, SI_ORDER_ANY. As the former
depends on the latter running first move it later in the boot,
to the new SI_SUB_KHELP. This ensures KHELP_DECLARE_MOD_UMA
module SYSINIT functions will be after the kernel linker.

Previously we may have received a panic similar to the following if
the order was incorrect:

panic: module_register_init: module named ertt not found

Reported by: bob prohaska <fbsd AT www.zefox.net>
Discussed with: imp, jhb
Sponsored by: The FreeBSD Foundation


# e5236836 11-Mar-2021 Warner Losh <imp@FreeBSD.org>

config_intrhook: provide config_intrhook_drain

config_intrhook_drain will remove the hook from the list as
config_intrhook_disestablish does if the hook hasn't been called. If it has,
config_intrhook_drain will wait for the hook to be disestablished in the normal
course (or expedited, it's up to the driver to decide how and when
to call config_intrhook_disestablish).

This is intended for removable devices that use config_intrhook and might be
attached early in boot, but that may be removed before the kernel can call the
config_intrhook or before it ends. To prevent all races, the detach routine will
need to call config_intrhook_train.

Sponsored by: Netflix, Inc
Reviewed by: jhb, mav, gde (in D29006 for man page)
Differential Revision: https://reviews.freebsd.org/D29005


# 88a55912 08-Mar-2021 Warner Losh <imp@FreeBSD.org>

config_intrhook: Move from TAILQ to STAILQ and padding

config_intrhook doesn't need to be a two-pointer TAILQ. We rarely add/delete
from this and so those need not be optimized. Instaed, use the one-pointer
STAILQ plus a uintptr_t to be used as a flags word. This will allow these
changes to be MFC'd to 12 and 13 to fix a race in removable devices.

Feedback from: jhb
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D29004


# 45f65081 21-Sep-2020 Mitchell Horne <mhorne@FreeBSD.org>

Hide tunable definitions behind _KERNEL

Some userspace code include sys/kernel.h. Namely, some OpenZFS tests do
this, and it was causing breakage after r365945 due to a lack of bool
typedef. Userspace should not need the TUNABLE_** stuff, so hide it
behind an #ifdef _KERNEL.

Sorry for the breakage.

Reported by: andrew, Michael Butler, Jenkins
Discussed with: kevans, allanjude


# cba446e2 21-Sep-2020 Mitchell Horne <mhorne@FreeBSD.org>

Add getenv(9) boolean parsing functions

This adds the getenv_bool() function, to parse a boolean value from a
kernel environment variable or tunable. This works for traditional
boolean values like "0" and "1", and also "true" and "false"
(case-insensitive). These semantics do not yet apply to sysctls declared
using SYSCTL_BOOL with CTLFLAG_TUN (they still only parse 1 and 0).

Also added are two wrapper functions, getenv_is_true() and
getenv_is_false(). These are slightly simpler for callers wishing to
perform a single check of a configuration variable.

Reviewed by: jhb (slightly earlier version)
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26270


# f6e54eb3 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

sys: clean up empty lines in .c and .h files


# fffcb56f 06-Mar-2020 Mark Johnston <markj@FreeBSD.org>

Add COUNTER_U64_SYSINIT() and COUNTER_U64_DEFINE_EARLY().

The aim is to reduce the boilerplate needed today to define and
initialize global counters. Also add SI_SUB_COUNTER to the sysinit
ordering.

Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23977


# 4d4fcf9c 06-Mar-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Define more subsystem orders.
Intended for use with module_init_order() in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 7993a104 22-Nov-2019 Conrad Meyer <cem@FreeBSD.org>

Add explicit SI_SUB_EPOCH

Add explicit SI_SUB_EPOCH, after SI_SUB_TASKQ and before SI_SUB_SMP
(EARLY_AP_STARTUP). Rename existing "SI_SUB_TASKQ + 1" to SI_SUB_EPOCH.

epoch(9) consumers cannot epoch_alloc() before SI_SUB_EPOCH:SI_ORDER_SECOND,
but likely should allocate before SI_SUB_SMP. Prior to this change,
consumers (well, epoch itself, and net/if.c) just open-coded the
SI_SUB_TASKQ + 1 order to match epoch.c, but this was fragile.

Reviewed by: mmacy
Differential Revision: https://reviews.freebsd.org/D22503


# 661722e7 21-Mar-2018 Konstantin Belousov <kib@FreeBSD.org>

Move sysinit and sysuninit linker sets in the data (writeable) section.

Both sets are sorted in place, and with the introduction of read-only
permissions on the amd64 kernel text, the sorting override depended on
CR0.WP turned off. Make it correct by moving the sets into writeable
part of the KVA, also fixing boot on machines where hand-off from BIOS
to OS occurs with CR0.WP set.

Based on submission by: Peter Lei <peter.lei@ieee.org>
MFC after: 1 week


# 95e678f9 31-Dec-2017 Colin Percival <cperciva@FreeBSD.org>

Use the TSLOG framework to record SYSINIT entry/exit timestamps.


# df57947f 18-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

spdx: initial adoption of licensing ID tags.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes: yes
Differential Revision: https://reviews.freebsd.org/D13133


# fe0593d7 16-Aug-2017 Conrad Meyer <cem@FreeBSD.org>

Add SI_SUB_TASKQ after SI_SUB_INTR and move taskqueue initialization there for EARLY_AP_STARTUP

This fixes a regression accidentally introduced in r322588, due to an
interaction with EARLY_AP_STARTUP.

Reviewed by: bdrewery@, jhb@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12053


# 2db14f97 13-Aug-2017 Ian Lepore <ian@FreeBSD.org>

Add config_intrhook_oneshot(): schedule an intrhook function and unregister
it automatically after it runs.

The config_intrhook mechanism allows a driver to stall the boot process
until device(s) required for booting are available, by not allowing system
inits to proceed until all intrhook functions have been unregistered.
Virtually all existing code simply unregisters from within the hook function
when it gets called.

This new function makes that common usage more convenient. Instead of
allocating and filling in a struct, passing it to a function that might (in
theory) fail, and checking the return code, now a driver can simply call
this cannot-fail routine, passing just the intrhook function and its arg.

Differential Revision: https://reviews.freebsd.org/D11963


# 97222e17 27-Mar-2017 Konstantin Belousov <kib@FreeBSD.org>

Fix TUNABLE_UINT64() on 32bit architectures.

The macro is not used in the tree.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 3f58662d 01-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

The pr_destroy field does not allow us to run the teardown code in a
specific order. VNET_SYSUNINITs however are doing exactly that.
Thus remove the VIMAGE conditional field from the domain(9) protosw
structure and replace it with VNET_SYSUNINITs.
This also allows us to change some order and to make the teardown functions
file local static.
Also convert divert(4) as it uses the same mechanism ip(4) and ip6(4) use
internally.

Slightly reshuffle the SI_SUB_* fields in kernel.h and add a new ones, e.g.,
for pfil consumers (firewalls), partially for this commit and for others
to come.

Reviewed by: gnn, tuexen (sctp), jhb (kernel.h)
Obtained from: projects/vnet
MFC after: 2 weeks
X-MFC: do not remove pr_destroy
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6652


# fdce57a0 14-May-2016 John Baldwin <jhb@FreeBSD.org>

Add an EARLY_AP_STARTUP option to start APs earlier during boot.

Currently, Application Processors (non-boot CPUs) are started by
MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until
SI_SUB_SMP at which point they are released to run kernel threads.
SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter
the scheduler and start running threads until fairly late in the
boot.

This change moves SI_SUB_SMP up to just before software interrupt
threads are created allowing the APs to start executing kernel
threads much sooner (before any devices are probed). This allows
several initialization routines that need to perform initialization
on all CPUs to now perform that initialization in one step rather
than having to defer the AP initialization to a second SYSINIT run
at SI_SUB_SMP. It also permits all CPUs to be available for
handling interrupts before any devices are probed.

This last feature fixes a problem on with interrupt vector exhaustion.
Specifically, in the old model all device interrupts were routed
onto the boot CPU during boot. Later after the APs were released at
SI_SUB_SMP, interrupts were redistributed across all CPUs.

However, several drivers for multiqueue hardware allocate N interrupts
per CPU in the system. In a system with many CPUs, just a few drivers
doing this could exhaust the available pool of interrupt vectors on
the boot CPU as each driver was allocating N * mp_ncpu vectors on the
boot CPU. Now, drivers will allocate interrupts on their desired CPUs
during boot meaning that only N interrupts are allocated from the boot
CPU instead of N * mp_ncpu.

Some other bits of code can also be simplified as smp_started is
now true much earlier and will now always be true for these bits of
code. This removes the need to treat the single-CPU boot environment
as a special case.

As a transition aid, the new behavior is available under a new kernel
option (EARLY_AP_STARTUP). This will allow the option to be turned off
if need be during initial testing. I plan to enable this on x86 by
default in a followup commit in the next few days and to have all
platforms moved over before 11.0. Once the transition is complete,
the option will be removed along with the !EARLY_AP_STARTUP code.

These changes have only been tested on x86. Other platform maintainers
are encouraged to port their architectures over as well. The main
things to check for are any uses of smp_started in MD code that can be
simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in
the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).

PR: kern/199321
Reviewed by: markj, gnn, kib
Sponsored by: Netflix


# 15be49f5 14-Apr-2016 Warner Losh <imp@FreeBSD.org>

Create wrappers for uint64_t and int64_t for the tunables. While not
strictly necessary, it is more convenient.


# 1f12da0e 22-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Just checkpoint the WIP in order to be able to make the tree update
easier. Note: this is currently not in a usable state as certain
teardown parts are not called and the DOMAIN rework is missing.
More to come soon and find its way to head.

Obtained from: P4 //depot/user/bz/vimage/...
Sponsored by: The FreeBSD Foundation


# 55811a03 08-Oct-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove comment obsoleted by r289056.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# 0ad53012 08-Oct-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove unused SI_SUB_* #defines.

Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3708


# d1b06863 30-Jun-2015 Mark Murray <markm@FreeBSD.org>

Huge cleanup of random(4) code.

* GENERAL
- Update copyright.
- Make kernel options for RANDOM_YARROW and RANDOM_DUMMY. Set
neither to ON, which means we want Fortuna
- If there is no 'device random' in the kernel, there will be NO
random(4) device in the kernel, and the KERN_ARND sysctl will
return nothing. With RANDOM_DUMMY there will be a random(4) that
always blocks.
- Repair kern.arandom (KERN_ARND sysctl). The old version went
through arc4random(9) and was a bit weird.
- Adjust arc4random stirring a bit - the existing code looks a little
suspect.
- Fix the nasty pre- and post-read overloading by providing explictit
functions to do these tasks.
- Redo read_random(9) so as to duplicate random(4)'s read internals.
This makes it a first-class citizen rather than a hack.
- Move stuff out of locked regions when it does not need to be
there.
- Trim RANDOM_DEBUG printfs. Some are excess to requirement, some
behind boot verbose.
- Use SYSINIT to sequence the startup.
- Fix init/deinit sysctl stuff.
- Make relevant sysctls also tunables.
- Add different harvesting "styles" to allow for different requirements
(direct, queue, fast).
- Add harvesting of FFS atime events. This needs to be checked for
weighing down the FS code.
- Add harvesting of slab allocator events. This needs to be checked for
weighing down the allocator code.
- Fix the random(9) manpage.
- Loadable modules are not present for now. These will be re-engineered
when the dust settles.
- Use macros for locks.
- Fix comments.

* src/share/man/...
- Update the man pages.

* src/etc/...
- The startup/shutdown work is done in D2924.

* src/UPDATING
- Add UPDATING announcement.

* src/sys/dev/random/build.sh
- Add copyright.
- Add libz for unit tests.

* src/sys/dev/random/dummy.c
- Remove; no longer needed. Functionality incorporated into randomdev.*.

* live_entropy_sources.c live_entropy_sources.h
- Remove; content moved.
- move content to randomdev.[ch] and optimise.

* src/sys/dev/random/random_adaptors.c src/sys/dev/random/random_adaptors.h
- Remove; plugability is no longer used. Compile-time algorithm
selection is the way to go.

* src/sys/dev/random/random_harvestq.c src/sys/dev/random/random_harvestq.h
- Add early (re)boot-time randomness caching.

* src/sys/dev/random/randomdev_soft.c src/sys/dev/random/randomdev_soft.h
- Remove; no longer needed.

* src/sys/dev/random/uint128.h
- Provide a fake uint128_t; if a real one ever arrived, we can use
that instead. All that is needed here is N=0, N++, N==0, and some
localised trickery is used to manufacture a 128-bit 0ULLL.

* src/sys/dev/random/unit_test.c src/sys/dev/random/unit_test.h
- Improve unit tests; previously the testing human needed clairvoyance;
now the test will do a basic check of compressibility. Clairvoyant
talent is still a good idea.
- This is still a long way off a proper unit test.

* src/sys/dev/random/fortuna.c src/sys/dev/random/fortuna.h
- Improve messy union to just uint128_t.
- Remove unneeded 'static struct fortuna_start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])

* src/sys/dev/random/yarrow.c src/sys/dev/random/yarrow.h
- Improve messy union to just uint128_t.
- Remove unneeded 'staic struct start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])
- Fix some magic numbers elsewhere used as FAST and SLOW.

Differential Revision: https://reviews.freebsd.org/D2025
Reviewed by: vsevolod,delphij,rwatson,trasz,jmg
Approved by: so (delphij)


# 089f02b0 06-May-2014 Robert Watson <rwatson@FreeBSD.org>

Spell raccdt in a more conventional way in a comment.

MFC after: 3 days


# 0e452337 05-May-2014 Robert Watson <rwatson@FreeBSD.org>

Garbage collect two more unused sysinit subsystems: SI_SUB_KVM_RSRC and
SI_SUB_CLISTS.

MFC after: 3 days


# a2496f6e 02-May-2014 Robert Watson <rwatson@FreeBSD.org>

Garbage collect mtxpool_lockbuilder, the mutex pool historically used
for lockmgr and sx interlocks, but unused since optimised versions of
those sleep locks were introduced. This will save a (quite) small
amount of memory in all kernel configurations. The sleep mutex pool is
retained as it is used for 'struct bio' and several other consumers.

Discussed with: jhb
MFC after: 3 days


# 76acc41f 29-Aug-2013 Justin T. Gibbs <gibbs@FreeBSD.org>

Implement vector callback for PVHVM and unify event channel implementations

Re-structure Xen HVM support so that:
- Xen is detected and hypercalls can be performed very
early in system startup.
- Xen interrupt services are implemented using FreeBSD's native
interrupt delivery infrastructure.
- the Xen interrupt service implementation is shared between PV
and HVM guests.
- Xen interrupt handlers can optionally use a filter handler
in order to avoid the overhead of dispatch to an interrupt
thread.
- interrupt load can be distributed among all available CPUs.
- the overhead of accessing the emulated local and I/O apics
on HVM is removed for event channel port events.
- a similar optimization can eventually, and fairly easily,
be used to optimize MSI.

Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure,
and misc Xen cleanups:

Sponsored by: Spectra Logic Corporation

Unification of PV & HVM interrupt infrastructure, bug fixes,
and misc Xen cleanups:

Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D

sys/x86/x86/local_apic.c:
sys/amd64/include/apicvar.h:
sys/i386/include/apicvar.h:
sys/amd64/amd64/apic_vector.S:
sys/i386/i386/apic_vector.s:
sys/amd64/amd64/machdep.c:
sys/i386/i386/machdep.c:
sys/i386/xen/exception.s:
sys/x86/include/segments.h:
Reserve IDT vector 0x93 for the Xen event channel upcall
interrupt handler. On Hypervisors that support the direct
vector callback feature, we can request that this vector be
called directly by an injected HVM interrupt event, instead
of a simulated PCI interrupt on the Xen platform PCI device.
This avoids all of the overhead of dealing with the emulated
I/O APIC and local APIC. It also means that the Hypervisor
can inject these events on any CPU, allowing upcalls for
different ports to be handled in parallel.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
Map Xen per-vcpu area during AP startup.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
Increase the FreeBSD IRQ vector table to include space
for event channel interrupt sources.

sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
Remove Xen HVM per-cpu variable data. These fields are now
allocated via the dynamic per-cpu scheme. See xen_intr.c
for details.

sys/amd64/include/xen/hypercall.h:
sys/dev/xen/blkback/blkback.c:
sys/i386/include/xen/xenvar.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/xen/gnttab.c:
Prefer FreeBSD primatives to Linux ones in Xen support code.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/console/xencons_ring.c:
sys/dev/xen/control/control.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/xenpci/xenpci.c:
sys/i386/i386/machdep.c:
sys/i386/include/pmap.h:
sys/i386/include/xen/xenfunc.h:
sys/i386/isa/npx.c:
sys/i386/xen/clock.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/xen_rtc.c:
sys/xen/evtchn/evtchn_dev.c:
sys/xen/features.c:
sys/xen/gnttab.c:
sys/xen/gnttab.h:
sys/xen/hvm.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstore.c:
sys/xen/xenstore/xenstore_dev.c:
sys/xen/xenstore/xenstorevar.h:
Pull common Xen OS support functions/settings into xen/xen-os.h.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
Remove constants, macros, and functions unused in FreeBSD's Xen
support.

sys/xen/xen-os.h:
sys/i386/xen/xen_machdep.c:
sys/x86/xen/hvm.c:
Introduce new functions xen_domain(), xen_pv_domain(), and
xen_hvm_domain(). These are used in favor of #ifdefs so that
FreeBSD can dynamically detect and adapt to the presence of
a hypervisor. The goal is to have an HVM optimized GENERIC,
but more is necessary before this is possible.

sys/amd64/amd64/machdep.c:
sys/dev/xen/xenpci/xenpcivar.h:
sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/sys/kernel.h:
Refactor magic ioport, Hypercall table and Hypervisor shared
information page setup, and move it to a dedicated HVM support
module.

HVM mode initialization is now triggered during the
SI_SUB_HYPERVISOR phase of system startup. This currently
occurs just after the kernel VM is fully setup which is
just enough infrastructure to allow the hypercall table
and shared info page to be properly mapped.

sys/xen/hvm.h:
sys/x86/xen/hvm.c:
Add definitions and a method for configuring Hypervisor event
delievery via a direct vector callback.

sys/amd64/include/xen/xen-os.h:
sys/x86/xen/hvm.c:

sys/conf/files:
sys/conf/files.amd64:
sys/conf/files.i386:
Adjust kernel build to reflect the refactoring of early
Xen startup code and Xen interrupt services.

sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
sys/dev/xen/control/control.c:
sys/dev/xen/evtchn/evtchn_dev.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/xen/xenstore/xenstore.c:
sys/xen/evtchn/evtchn_dev.c:
sys/dev/xen/console/console.c:
sys/dev/xen/console/xencons_ring.c
Adjust drivers to use new xen_intr_*() API.

sys/dev/xen/blkback/blkback.c:
Since blkback defers all event handling to a taskqueue,
convert this task queue to a "fast" taskqueue, and schedule
it via an interrupt filter. This avoids an unnecessary
ithread context switch.

sys/xen/xenstore/xenstore.c:
The xenstore driver is MPSAFE. Indicate as much when
registering its interrupt handler.

sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusvar.h:
Remove unused event channel APIs.

sys/xen/evtchn.h:
Remove all kernel Xen interrupt service API definitions
from this file. It is now only used for structure and
ioctl definitions related to the event channel userland
device driver.

Update the definitions in this file to match those from
NetBSD. Implementing this interface will be necessary for
Dom0 support.

sys/xen/evtchn/evtchnvar.h:
Add a header file for implemenation internal APIs related
to managing event channels event delivery. This is used
to allow, for example, the event channel userland device
driver to access low-level routines that typical kernel
consumers of event channel services should never access.

sys/xen/interface/event_channel.h:
sys/xen/xen_intr.h:
Standardize on the evtchn_port_t type for referring to
an event channel port id. In order to prevent low-level
event channel APIs from leaking to kernel consumers who
should not have access to this data, the type is defined
twice: Once in the Xen provided event_channel.h, and again
in xen/xen_intr.h. The double declaration is protected by
__XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared
twice within a given compilation unit.

sys/xen/xen_intr.h:
sys/xen/evtchn/evtchn.c:
sys/x86/xen/xen_intr.c:
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/xenpci/xenpcivar.h:
New implementation of Xen interrupt services. This is
similar in many respects to the i386 PV implementation with
the exception that events for bound to event channel ports
(i.e. not IPI, virtual IRQ, or physical IRQ) are further
optimized to avoid mask/unmask operations that aren't
necessary for these edge triggered events.

Stubs exist for supporting physical IRQ binding, but will
need additional work before this implementation can be
fully shared between PV and HVM.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
sys/i386/xen/mp_machdep.c
sys/x86/xen/hvm.c:
Add support for placing vcpu_info into an arbritary memory
page instead of using HYPERVISOR_shared_info->vcpu_info.
This allows the creation of domains with more than 32 vcpus.

sys/i386/i386/machdep.c:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/exception.s:
Add support for new event channle implementation.


# 785797c3 24-Jul-2013 Andriy Gapon <avg@FreeBSD.org>

rename scheduler->swapper and SI_SUB_RUN_SCHEDULER->SI_SUB_LAST

Also directly call swapper() at the end of mi_startup instead of
relying on swapper being the last thing in sysinits order.

Rationale:

- "RUN_SCHEDULER" was misleading, scheduling already takes place at that stage
- "scheduler" was misleading, the function swaps in the swapped out processes
- another SYSINIT(SI_SUB_RUN_SCHEDULER, SI_ORDER_ANY) could never be
invoked depending on its relative order with scheduler; this was not obvious
and the bug actually used to exist

Reviewed by: kib (ealier version)
MFC after: 14 days


# a8df530d 28-Jan-2013 John Baldwin <jhb@FreeBSD.org>

Mark 'ticks', 'time_second', and 'time_uptime' as volatile to prevent the
compiler from caching their values in tight loops.

Reviewed by: bde
MFC after: 1 week


# a8c9f6fd 19-Oct-2012 Andre Oppermann <andre@FreeBSD.org>

Remove splimp() comment from sysinit table and attribute SI_SUB_PROTO_BEGIN
and SI_SUB_PROTO_END to VNET related initializations.

MFC after: 3 days


# 56d20d01 12-Jun-2012 John Baldwin <jhb@FreeBSD.org>

Replace a reference to the non-existent SI_ORDER_LAST in a comment with
SI_ORDER_ANY.

Submitted by: Brandon Gooch brandongooch yahoo com


# 097055e2 29-Mar-2011 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add racct. It's an API to keep per-process, per-jail, per-loginclass
and per-loginclass resource accounting information, to be used by the new
resource limits code. It's connected to the build, but the code that
actually calls the new functions will come later.

Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)


# c77715ef 11-Mar-2011 Matthew D Fleming <mdf@FreeBSD.org>

Mostly revert r219468, as I had misremembered the C standard regarding
the size of an extern array.

Keep one change from strncpy to strlcpy.


# cd67ac41 10-Mar-2011 Matthew D Fleming <mdf@FreeBSD.org>

Use MAXPATHLEN rather than the size of an extern array when copying the
kernel name. Also consistenly use strlcpy().

Suggested by: Warner Losh


# 5b24354c 10-Nov-2010 Alexander Motin <mav@FreeBSD.org>

Remove unexisted since r212541 timer1hz/timer2hz variables.


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# dbd55f3f 24-May-2010 Alexander Motin <mav@FreeBSD.org>

- Implement MI helper functions, dividing one or two timer interrupts with
arbitrary frequencies into hardclock(), statclock() and profclock() calls.
Same code with minor variations duplicated several times over the tree for
different timer drivers and architectures.
- Switch all x86 archs to new functions, simplifying the code and removing
extra logic from timer drivers. Other archs are also welcome.


# f47552e7 23-Oct-2009 Ruslan Ermilov <ru@FreeBSD.org>

MFC r198295:

Random number generator initialization cleanup:

- Introduce new SI_SUB_RANDOM point in boot sequence to make it
clear from where one may start using random(9). It should be as
early as possible, so place it just after SI_SUB_CPU where we
have some randomness on most platforms via get_cyclecount().

- Move stack protector initialization to be after SI_SUB_RANDOM
as before this point we have no randomness at all. This fixes
stack protector to actually protect stack with some random guard
value instead of a well-known one.

Note that this patch doesn't try to address arc4random(9) issues.
With current code, it will be implicitly seeded by stack protector
and hence will get the same entropy as random(9). It will be
securely reseeded once /dev/random is feeded by some entropy from
userland.

Submitted by: Maxim Dounin <mdounin@mdounin.ru>
Approved by: re (kib)


# e64585bd 20-Oct-2009 Ruslan Ermilov <ru@FreeBSD.org>

Random number generator initialization cleanup:

- Introduce new SI_SUB_RANDOM point in boot sequence to make it
clear from where one may start using random(9). It should be as
early as possible, so place it just after SI_SUB_CPU where we
have some randomness on most platforms via get_cyclecount().

- Move stack protector initialization to be after SI_SUB_RANDOM
as before this point we have no randomness at all. This fixes
stack protector to actually protect stack with some random guard
value instead of a well-known one.

Note that this patch doesn't try to address arc4random(9) issues.
With current code, it will be implicitly seeded by stack protector
and hence will get the same entropy as random(9). It will be
securely reseeded once /dev/random is feeded by some entropy from
userland.

Submitted by: Maxim Dounin <mdounin@mdounin.ru>
MFC after: 3 days


# d0728d71 23-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:

- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
occur each time a network stack is instantiated and destroyed. In the
!VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
For the VIMAGE case, we instead use SYSINIT's to track their order and
properties on registration, using them for each vnet when created/
destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
previously, as well as its dependency scheme: we now just use the
SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
they want init functions to be called for each virtual network stack
rather than just once at boot, compiling down to DOMAIN_SET() in the
non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
of modinfo or DOMAIN_SET() for init/uninit events. In some cases,
convert modular components from using modevent to using sysinit (where
appropriate). In some cases, do minor rejuggling of SYSINIT ordering
to make room for or better manage events.

Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup)
Discussed with: jhb, bz, julian, zec
Reviewed by: bz
Approved by: re (VIMAGE blanket)


# 5ee847d3 19-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Reimplement and/or implement vnet list locking by replacing a mostly
unused custom mutex/condvar-based sleep locks with two locks: an
rwlock (for non-sleeping use) and sxlock (for sleeping use). Either
acquired for read is sufficient to stabilize the vnet list, but both
must be acquired for write to modify the list.

Replace previous no-op read locking macros, used in various places
in the stack, with actual locking to prevent race conditions. Callers
must declare when they may perform unbounded sleeps or not when
selecting how to lock.

Refactor vnet sysinits so that the vnet list and locks are initialized
before kernel modules are linked, as the kernel linker will use them
for modules loaded by the boot loader.

Update various consumers of these KPIs based on whether they may sleep
or not.

Reviewed by: bz
Approved by: re (kib)


# 76ca6f88 29-May-2009 Jamie Gritton <jamie@FreeBSD.org>

Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex. Jails may
have their own host information, or they may inherit it from the
parent/system. The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL. The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by: bz (mentor)


# 29b02909 08-May-2009 Marko Zec <zec@FreeBSD.org>

Introduce a new virtualization container, provisionally named vprocg, to hold
virtualized instances of hostname and domainname, as well as a new top-level
virtualization struct vimage, which holds pointers to struct vnet and struct
vprocg. Struct vprocg is likely to become replaced in the near future with
a new jail management API import.

As a consequence of this change, change struct ucred to point to a struct
vimage, instead of directly pointing to a vnet.

Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage
branch.

Permit kldload / kldunload operations to be executed only from the default
vimage context.

This change should have no functional impact on nooptions VIMAGE kernel
builds.

Reviewed by: bz
Approved by: julian (mentor)


# bfe1aba4 10-Apr-2009 Marko Zec <zec@FreeBSD.org>

Introduce vnet module registration / initialization framework with
dependency tracking and ordering enforcement.

With this change, per-vnet initialization functions introduced with
r190787 are no longer directly called from traditional initialization
functions (which cc in most cases inlined to pre-r190787 code), but are
instead registered via the vnet framework first, and are invoked only
after all prerequisite modules have been initialized. In the long run,
this framework should allow us to both initialize and dismantle
multiple vnet instances in a correct order.

The problem this change aims to solve is how to replay the
initialization sequence of various network stack components, which
have been traditionally triggered via different mechanisms (SYSINIT,
protosw). Note that this initialization sequence was and still can be
subtly different depending on whether certain pieces of code have been
statically compiled into the kernel, loaded as modules by boot
loader, or kldloaded at run time.

The approach is simple - we record the initialization sequence
established by the traditional mechanisms whenever vnet_mod_register()
is called for a particular vnet module. The vnet_mod_register_multi()
variant allows a single initializer function to be registered multiple
times but with different arguments - currently this is only used in
kern/uipc_domain.c by net_add_domain() with different struct domain *
as arguments, which allows for protosw-registered initialization
routines to be invoked in a correct order by the new vnet
initialization framework.

For the purpose of identifying vnet modules, each vnet module has to
have a unique ID, which is statically assigned in sys/vimage.h.
Dynamic assignment of vnet module IDs is not supported yet.

A vnet module may specify a single prerequisite module at registration
time by filling in the vmi_dependson field of its vnet_modinfo struct
with the ID of the module it depends on. Unless specified otherwise,
all vnet modules depend on VNET_MOD_NET (container for ifnet list head,
rt_tables etc.), which thus has to and will always be initialized
first. The framework will panic if it detects any unresolved
dependencies before completing system initialization. Detection of
unresolved dependencies for vnet modules registered after boot
(kldloaded modules) is not provided.

Note that the fact that each module can specify only a single
prerequisite may become problematic in the long run. In particular,
INET6 depends on INET being already instantiated, due to TCP / UDP
structures residing in INET container. IPSEC also depends on INET,
which will in turn additionally complicate making INET6-only kernel
configs a reality.

The entire registration framework can be compiled out by turning on the
VIMAGE_GLOBALS kernel config option.

Reviewed by: bz
Approved by: julian (mentor)


# 385195c0 10-Dec-2008 Marko Zec <zec@FreeBSD.org>

Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 040b1db9 19-Aug-2008 Ed Schouten <ed@FreeBSD.org>

Remove the now unused `lbolt' variable from the kernel.

We used to have a single wait channel inside the kernel which could be
used by threads that just wanted to sleep for some time (the next
second). The old TTY layer was the only piece of code that still used
lbolt, because I already removed the use of lbolt from the NFS clients
and the VFS syncer.

Approved by: philip


# 7f41115e 21-Jul-2008 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Implement the following macros for completeness:

SYSCTL_QUAD()
SYSCTL_ADD_QUAD()
TUNABLE_QUAD()
TUNABLE_QUAD_FETCH()

Now we can use 64bit tunables on 32bit systems.


# 4f7d1876 05-Jul-2008 Robert Watson <rwatson@FreeBSD.org>

Introduce a new lock, hostname_mtx, and use it to synchronize access
to global hostname and domainname variables. Where necessary, copy
to or from a stack-local buffer before performing copyin() or
copyout(). A few uses, such as in cd9660 and daemon_saver, remain
under-synchronized and will require further updates.

Correct a bug in which a failed copyin() of domainname would leave
domainname potentially corrupted.

MFC after: 3 weeks


# 69b2c659 18-May-2008 John Birrell <jb@FreeBSD.org>

Add sysinit levels for DTrace.


# 00c71fb7 08-Apr-2008 Sam Leffler <sam@FreeBSD.org>

o add a mountroot event handler that fires when / is mounted; this information
was lost when root started being mounted by init
o remove SI_SUB_MOUNT_ROOT since it's no longer meaningful

MFC after: 2 weeks


# dd3af71f 16-Mar-2008 Robert Watson <rwatson@FreeBSD.org>

Remove trailing ';' from C_SYSINIT() macro definition, in keeping
with style(9) recommendation that macros not contain the
terminating ';', leaving that to the invoker. All SYSINIT()
consumers must now provide a trailing ';'.

Unlike the change to remove the ';'s from callers, this change
shouldn't be MFC'd unless we don't mind requiring source changes
to third party modules that might still depend on SYSINIT()
providing its own ';'.


# 55c3064e 25-Dec-2007 Robert Watson <rwatson@FreeBSD.org>

Add a new kernel startup event for DDB services, which will include DDB
output capture, scripting, and textdumps.


# e3709a56 28-Nov-2007 John Birrell <jb@FreeBSD.org>

Remove _SOLARIS_C_SOURCE compatibility definitions. Unfortunately the
ZFS porting style didn't extend this, instead using a heap of additional
header files that don't get installed.

My intention had been to allow OpenSolaris external code to build on
FreeBSD out of the box (i.e. without a src tree).


# 33d2bb9c 27-Jul-2007 Robert Watson <rwatson@FreeBSD.org>

First in a series of changes to remove the now-unused Giant compatibility
framework for non-MPSAFE network protocols:

- Remove debug_mpsafenet variable, sysctl, and tunable.
- Remove NET_NEEDS_GIANT() and associate SYSINITSs used by it to force
debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel.
- Remove logic to automatically flag interrupt handlers as non-MPSAFE if
debug.mpsafenet is set for an INTR_TYPE_NET handler.
- Remove logic to automatically flag netisr handlers as non-MPSAFE if
debug.mpsafenet is set.
- Remove references in a few subsystems, including NFS and Cronyx drivers,
which keyed off debug_mpsafenet to determine various aspects of their own
locking behavior.
- Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into
no-op's, as their entire behavior was determined by the value in
debug_mpsafenet.
- Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE.

Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still
present in subsystems, and will be removed in followup commits.

Reviewed by: bz, jhb
Approved by: re (kensmith)


# 6db10720 09-Apr-2007 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Fix build breakage.


# 82068fe7 09-Apr-2007 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Add kern.hostuuid sysctl, which will be used to keep host's UUID.

Reviewed by: mlaier, rink, brooks, rwatson


# 24c3c19e 05-Apr-2007 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Hide lbolt under _SOLARIS_C_SOURCE in preparation for ZFS import.
I really couldn't avoid this with preprocessor magic.


# f645b0b5 01-Oct-2006 Poul-Henning Kamp <phk@FreeBSD.org>

First part of a little cleanup in the calendar/timezone/RTC handling.

Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.


# 03e161fd 01-Aug-2006 John Baldwin <jhb@FreeBSD.org>

Make system call modules a bit more robust:
- If we fail to register the system call during MOD_LOAD, then note that
so that we don't try to deregister it or invoke the chained event handler
during the subsequent MOD_UNLOAD event. Doing the deregister when the
register failed could result in trashing system call entries.
- Add a SI_SUB_SYSCALLS just before starting up init and use that to
register syscall modules instead of SI_SUB_DRIVERS. Registering system
calls as late as possible increases the chances that any other module
event handlers or SYSINITs in a module are executed to initialize the
data in a kld before a syscall dependent on that data is able to be
invoked.

MFC after: 3 days


# 9806934e 26-May-2006 Poul-Henning Kamp <phk@FreeBSD.org>

Be less harsh on brueffers eyes :-)


# 0aa6cf76 26-May-2006 Poul-Henning Kamp <phk@FreeBSD.org>

Remove SI_SUB_CONSOLE, porting from 4.4-Lite is no longer an issue.


# 8df65d80 24-May-2006 Ruslan Ermilov <ru@FreeBSD.org>

GC long unused hostnamelen and domainnamelen.

Submitted by: Alex Lyashkov <shadow@psoft.net>


# d1dfd921 05-Sep-2005 Christian S.J. Peron <csjp@FreeBSD.org>

Convert the primary ACL allocator from malloc(9) to using a UMA zone instead.
Also introduce an aclinit function which will be used to create the UMA zone
for use by file systems at system start up.

MFC after: 1 month
Discussed with: rwatson


# 7f964aa6 17-Apr-2005 Stefan Farfeleder <stefanf@FreeBSD.org>

Use __CONCAT() in the TUNABLE_ macros, this way we don't have to use 3
macros per type.


# 0f7d553f 16-Apr-2005 Stefan Farfeleder <stefanf@FreeBSD.org>

Concatenate the line number rather than the string `__FILE__' in the
NET_NEEDS_GIANT macro. Until now this wasn't a problem because no
translation unit contains NET_NEEDS_GIANT more than once.


# f2ccd228 02-Feb-2005 Robert Watson <rwatson@FreeBSD.org>

Define SI_SUB_AUDIT, the system boot event to initialize the audit
subsystem.

Obtained from: TrustedBSD Project
Submitted by: Wayne Salamon <wsalamon at computer dot org>


# aa38f8f9 06-Jan-2005 Maksim Yevmenkin <emax@FreeBSD.org>

Introduce new startup level SI_SUB_NETGRAPH that is after
SI_SUB_INIT_IF but before SI_SUB_DRIVERS. Make Netgraph(4)
framework initialize at SI_SUB_NETGRAPH level.

This does not address the bigger problem: MODULE_DEPEND
does not seem to work when modules are compiled in the
kernel, but it fixes the problem with Netgraph Bluetooth
device drivers reported by a few folks.

PR: i386/69876
Reviewed by: julian, rik, scottl
MFC after: 3 days


# c27b9a8d 06-Dec-2004 Pawel Jakub Dawidek <pjd@FreeBSD.org>

We don't have RAIDFrame anymore and it seems gvinum doesn't use SI_SUB_RAID,
so correct stale comment. The only SI_SUB_RAID consumer is gmirror right now.


# ed3fdd0e 08-Nov-2004 Dag-Erling Smørgrav <des@FreeBSD.org>

Retire TUNABLE_QUAD_*.


# b0e1e474 31-Oct-2004 Dag-Erling Smørgrav <des@FreeBSD.org>

Add TUNABLE_LONG and TUNABLE_ULONG, and use the latter for the
hw.pci.host_mem_start tunable. Add comments to TUNABLE_INT and
TUNABLE_QUAD recommending against their use.

MFC after: 3 weeks


# 38228f72 31-Oct-2004 Dag-Erling Smørgrav <des@FreeBSD.org>

Whitespace cleanup


# 1d8cd39e 28-Aug-2004 Robert Watson <rwatson@FreeBSD.org>

Change the default disposition of debug.mpsafenet from 0 to 1, which
will cause the network stack to operate without the Giant lock by
default. This change has the potential to improve performance by
increasing parallelism and decreasing latency in network processing.

Due to the potential exposure of existing or new bugs, the following
compatibility functionality is maintained:

- It is still possible to disable Giant-free operation by setting
debug.mpsafenet to 0 in loader.conf.

- Add "options NET_WITH_GIANT", which will restore the default value of
debug.mpsafenet to 0, and is intended for use on systems compiled with
known unsafe components, or where a more conservative configuration is
desired.

- Add a new declaration, NET_NEEDS_GIANT("componentname"), which permits
kernel components to declare dependence on Giant over the network
stack. If the declaration is made by a preloaded module or a compiled
in component, the disposition of debug.mpsafenet will be set to 0 and
a warning concerning performance degraded operation printed to the
console. If it is declared by a loadable kernel module after boot, a
warning is displayed but the disposition cannot be changed. This is
implemented by defining a new SYSINIT() value, SI_SUB_SETTINGS, which
is intended for the processing of configuration choices after tunables
are read in and the console is available to generate errors, but
before much else gets going.

This compatibility behavior will go away when we've finished the last
of the locking work and are confident that operation is correct.


# 42e7a191 18-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove nested include of <sys/module.h>

Should be happier now: peter


# b033c30b 10-Mar-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the /* 1.2 */ comment which was orphaned by previous commit.


# 7314dafb 29-Feb-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Move boottime from <sys/kernel.h> to <sys/time.h> where it belongs.


# d3e1c241 13-Feb-2004 Nate Lawson <njl@FreeBSD.org>

Fix hw.acpi.os_name by renaming it to hw.acpi.osname. The "_name" suffix
is reserved by the loader, and thus any tunable name with that suffix will
be silently discarded.

Document this in the header and man page so that other developers do not
develop so many bumps on the head after banging it against the wall.

Detective work by: Mark Santcroos, grehan


# 31b1bfe1 17-Oct-2003 Hajimu UMEMOTO <ume@FreeBSD.org>

- add dom_if{attach,detach} framework.
- transition to use ifp->if_afdata.

Obtained from: KAME


# 6ff1481d 15-Jul-2003 Don Lewis <truckman@FreeBSD.org>

Rearrange the SYSINIT order to call lockmgr_init() earlier so that
the runtime lockmgr initialization code in lockinit() can be eliminated.

Reviewed by: jhb


# 857d9c60 12-Jul-2003 Don Lewis <truckman@FreeBSD.org>

Extend the mutex pool implementation to permit the creation and use of
multiple mutex pools with different options and sizes. Mutex pools can
be created with either the default sleep mutexes or with spin mutexes.
A dynamically created mutex pool can now be destroyed if it is no longer
needed.

Create two pools by default, one that matches the existing pool that
uses the MTX_NOWITNESS option that should be used for building higher
level locks, and a new pool with witness checking enabled.

Modify the users of the existing mutex pool to use the appropriate pool
in the new implementation.

Reviewed by: jhb


# 4074deda 03-Jun-2003 Greg Lehey <grog@FreeBSD.org>

Remove SI_SUB_VINUM. SI_SUB_RAID makes more sense.

Submitted by: hmp


# 91f1c2b3 03-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Split the global timezone structure into two integer fields to
prevent the compiler from optimizing assignments into byte-copy
operations which might make access to the individual fields non-atomic.

Use the individual fields throughout, and don't bother locking them with
Giant: it is no longer needed.

Inspired by: tjr


# 238dd320 03-Feb-2003 Jake Burkholder <jake@FreeBSD.org>

Split statclock into statclock and profclock, and made the method for driving
statclock based on profhz when profiling is enabled MD, since most platforms
don't use this anyway. This removes the need for statclock_process, whose
only purpose was to subdivide profhz, and gets the profiling clock running
outside of sched_lock on platforms that implement suswintr.
Also changed the interface for starting and stopping the profiling clock to
do just that, instead of changing the rate of statclock, since they can now
be separate.

Reviewed by: jhb, tmm
Tested on: i386, sparc64


# f9d186ed 20-Oct-2002 Scott Long <scottl@FreeBSD.org>

After much delay and anticipation, welcome RAIDFrame into the FreeBSD
world. This should be considered highly experimental.

Approved-by: re


# 609d4656 06-Jun-2002 John Baldwin <jhb@FreeBSD.org>

Add a new SYSINIT subsystem for KTRACE.


# d394511d 16-May-2002 Tom Rhodes <trhodes@FreeBSD.org>

More s/file system/filesystem/g


# 4a7edf69 14-May-2002 Robert Watson <rwatson@FreeBSD.org>

Strategic diff reduction against TrustedBSD MAC branch: introduce an
additional system boot ordering entry, SI_SUB_MAC_LATE, which occurs
after all MAC policies have been initialized, permitting the MAC
subsystem to take action once all "early loaded" modules are in place.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 1a924d72 21-Apr-2002 Mark Murray <markm@FreeBSD.org>

Parenthesise macro arguments to reduce lint warnings.


# e1d970f1 14-Apr-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Improve the implementation of adjtime(2).

Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitraryly huge
amounts of time in either direction.

In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset deducting
the amount slewed from the remainder. If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.

The old implementation stepped the clock a number of microseconds
every HZ to acheive the same effect, using the same rates of change.

Eliminate the global variables tickadj, tickdelta and timedelta and
their various use and initializations.

This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.


# 789f12fe 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P


# 0e0af8ec 13-Mar-2002 Brian Feldman <green@FreeBSD.org>

Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate.
While doing this, move it earlier in the sysinit boot process so that the
VM system can use it.

After that, the system is now able to use sx locks instead of lockmgr
locks in the VM system. To accomplish this, some of the more
questionable uses of the locks (such as testing whether they are
owned or not, as well as allowing shared+exclusive recursion) are
removed, and simpler logic throughout is used so locks should also be
easier to understand.

This has been tested on my laptop for months, and has not shown any
problems on SMP systems, either, so appears quite safe. One more
user of lockmgr down, many more to go :)


# 7e595f76 05-Mar-2002 Robert Watson <rwatson@FreeBSD.org>

Merge reservation of two SI_SUB constants for the MAC policy framework
and for individual MAC policies. The framework event initializes the
access control subsystem; the policy event allows policies to register
themselves. The gap in between is for all the things we'll think of
later.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# a7489fe5 08-Jan-2002 Mike Smith <msmith@FreeBSD.org>

Add a new sysinit SI_SUB_DEVFS. Devfs hooks into the kernel at SI_ORDER_FIRST,
and devices can be created anytime after that.

Print a warning if an atttempt is made to create a device too early.


# f2860039 13-Nov-2001 Matthew Dillon <dillon@FreeBSD.org>

Create a mutex pool API for short term leaf mutexes.
Replace the manual mutex pool in kern_lock.c (lockmgr locks) with the new API.
Replace the mutexes embedded in sxlocks with the new API.


# f9390180 01-Nov-2001 Mitsuru IWASAKI <iwasaki@FreeBSD.org>

Some fix for the recent apm module changes.
- Now that apm loadable module can inform its existence to other kernel
components (e.g. i386/isa/clock.c:startrtclock()'s TCS hack).
- Exchange priority of SI_SUB_CPU and SI_SUB_KLD for above purpose.
- Add simple arbitration mechanism for APM vs. ACPI. This prevents
the kernel enables both of them.
- Remove obsolete `#ifdef DEV_APM' related code.
- Add abstracted interface for Powermanagement operations. Public apm(4)
functions, such as apm_suspend(), should be replaced new interfaces.
Currently only power_pm_suspend (successor of apm_suspend) is implemented.

Reviewed by: peter, arch@ and audit@


# d19fc02a 23-Oct-2001 John Baldwin <jhb@FreeBSD.org>

Change TUNABLE_*_FETCH to have a return value of 0 if the variable was not
found or successfully converted and true otherwise.


# cbc89bfb 10-Oct-2001 Paul Saab <ps@FreeBSD.org>

Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader
tunable.

Reviewed by: peter
MFC after: 2 weeks


# f9132ceb 05-Sep-2001 Jonathan Lemon <jlemon@FreeBSD.org>

Wrap array accesses in macros, which also happen to be lvalues:

ifnet_addrs[i - 1] -> ifaddr_byindex(i)
ifindex2ifnet[i] -> ifnet_byindex(i)

This is intended to ease the conversion to SMPng.


# d4d79f27 22-Jun-2001 Matt Jacob <mjacob@FreeBSD.org>

Make hostid an unsigned long (matches kern_mib.c change)

PR: kern/21132
MFC after: 1 month


# f41325db 13-Jun-2001 Peter Wemm <peter@FreeBSD.org>

With this commit, I hereby pronounce gensetdefs past its use-by date.

Replace the a.out emulation of 'struct linker_set' with something
a little more flexible. <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.

The linker_set.h macros have been on the back burner in various
forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).

The macros declare a strongly typed set. They return elements with the
type that you declare the set with, rather than a generic void *.

For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.

For a.out, we use the old linker_set struct.

NOTE: the item lists are no longer null terminated. This is why
the code impact is high in certain areas.

The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.

linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.

Reviewed by: eivind


# 09786698 07-Jun-2001 Peter Wemm <peter@FreeBSD.org>

"Fix" the previous initial attempt at fixing TUNABLE_INT(). This time
around, use a common function for looking up and extracting the tunables
from the kernel environment. This saves duplicating the same function
over and over again. This way typically has an overhead of 8 bytes + the
path string, versus about 26 bytes + the path string.


# 4422746f 06-Jun-2001 Peter Wemm <peter@FreeBSD.org>

Back out part of my previous commit. This was a last minute change
and I botched testing. This is a perfect example of how NOT to do
this sort of thing. :-(


# 81930014 06-Jun-2001 Peter Wemm <peter@FreeBSD.org>

Make the TUNABLE_*() macros look and behave more consistantly like the
SYSCTL_*() macros. TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.


# 840e78b8 23-May-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Use the correct enums in struct sysinit.


# abd9053e 16-Apr-2001 John Baldwin <jhb@FreeBSD.org>

Blow away the panic mutex in favor of using a single atomic_cmpset() on a
panic_cpu shared variable. I used a simple atomic operation here instead
of a spin lock as it seemed to be excessive overhead. Also, this can avoid
recursive panics if, for example, witness is broken.


# 19284646 28-Mar-2001 John Baldwin <jhb@FreeBSD.org>

Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatability,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.


# d888fc4e 11-Feb-2001 Mark Murray <markm@FreeBSD.org>

RIP <machine/lock.h>.

Some things needed bits of <i386/include/lock.h> - cy.c now has its
own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK()
has been moved to <i386/include/apic.h> (AKA <machine/apic.h>).
Reviewed by: jhb


# 3687f15b 09-Feb-2001 John Baldwin <jhb@FreeBSD.org>

Add a new SYSINIT for interrupt thread initialization and stick
initialization right after it.


# 19b61693 04-Feb-2001 Peter Wemm <peter@FreeBSD.org>

Pull the rug from under the 'LKM Compatability' macro - PSEUDO_SET().
There are two 3rd party code chunks using this still - the IPv6 stuff and
i4b. Give them a private copy as an alternative to changing them too much.

XXX sys/kernel.h still has a #include <sys/module.h> in it. I will be
taking this out shortly - this affects a number of drivers.


# e7ed3882 30-Jan-2001 John Baldwin <jhb@FreeBSD.org>

Argh, fix a nit that snuck in while trying to resolve conflicts.


# 01fe326e 30-Jan-2001 John Baldwin <jhb@FreeBSD.org>

- Fix TUNABLE_STR_FETCH() to actually be a code fragment rather than
declaring a static function.
- Modify TUNABLE_*_DECL() to use TUNABLE_*_FETCH() to avoid code
duplication.

Reviewed by: peter


# bc234bdc 26-Jan-2001 Peter Wemm <peter@FreeBSD.org>

Bah, as my luck would have it, I had a kernel source tree in the window
while strlcpy() existed, before it got backed out due to an extended
bikeshed argument. Sigh. Back to the old version with the redundant
code to terminate the string. :-(


# d2e32c67 26-Jan-2001 Peter Wemm <peter@FreeBSD.org>

Use strlcpy() in TUNABLE_STR_xxx() and avoid an off-by-one.

Noticed by: dfr


# d1c1b841 21-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.

This change is necessary in order to avoid some circular bootstrapping
dependencies.


# 4058c0f0 28-Dec-2000 Peter Wemm <peter@FreeBSD.org>

Pull out the module path from the loader. ie: if you boot from
/boot/kernel.foobar/* then that had better be in the path ahead of the
others.

Submitted by: Daniel J. O'Connor <darius@dons.net.au>
PR: 23662


# 06592dd1 11-Dec-2000 John Baldwin <jhb@FreeBSD.org>

- Convert the per-eventhandler list mutex to a lockmgr lock so that it can
be safely held across an eventhandler function call.
- Fix an instance of the head of an eventhandler list being read without
the lock being held.
- Break down and use a SYSINIT at the new SI_SUB_EVENTHANDLER to initialize
the eventhandler global mutex and the eventhandler list of lists rather
than using a non-MP safe initialization during the first call to
eventhandler_register().
- Add in a KASSERT() to eventhandler_register() to ensure that we don't try
to register an eventhandler before things have been initialized.


# 32a48fa0 20-Oct-2000 John Baldwin <jhb@FreeBSD.org>

Revert the init_clocks change in revision 1.72. On the alpha we use an
ISA device for our clock, so trying to initialize the clock before probing
devices introduces a chicken and egg problem.

Debug help from: peter


# 9b57bb4e 19-Oct-2000 Peter Wemm <peter@FreeBSD.org>

execsw_set hasn't been used for a while and does not exist.


# cbbee1e4 19-Oct-2000 John Baldwin <jhb@FreeBSD.org>

Move init_clocks earlier in the system startup so that hardclock and clock
interrupts are started before the device probe. This allows interrupt
threads to run during the device probe among other things.


# 9722d88f 12-Oct-2000 Jason Evans <jasone@FreeBSD.org>

For lockmgr mutex protection, use an array of mutexes that are allocated
and initialized during boot. This avoids bloating sizeof(struct lock).
As a side effect, it is no longer necessary to enforce the assumtion that
lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been
removed.

Idea taken from: BSD/OS.


# 7d032714 30-Sep-2000 Bosko Milekic <bmilekic@FreeBSD.org>

Big mbuf subsystem diff #1: incorporate mutexes and fix things up somewhat
to accomodate the changes.

Here's a list of things that have changed (I may have left out a few); for a
relatively complete list, see http://people.freebsd.org/~bmilekic/mtx_journal

* Remove old (once useful) mcluster code for MCLBYTES > PAGE_SIZE which
nobody uses anymore. It was great while it lasted, but now we're moving
onto bigger and better things (Approved by: wollman).

* Practically re-wrote the allocation macros in sys/sys/mbuf.h to accomodate
new allocations which grab the necessary lock.

* Make sure that necessary mbstat variables are manipulated with
corresponding atomic() routines.

* Changed the "wait" routines, cleaned it up, made one routine that does
the job.

* Generalized MWAKEUP() macro. Got rid of m_retry and m_retryhdr, as they
are now included in the generalized "wait" routines.

* Sleep routines now use msleep().

* Free lists have locks.

* etc... probably other stuff I'm missing...

Things to look out for and work on later:

* find a better way to (dynamically) adjust EXT_COUNTERS

* move necessity to recurse on a lock from drain routines by providing
lock-free lower-level version of MFREE() (and possibly m_free()?).

* checkout include of mutex.h in sys/sys/mbuf.h - probably violating
general philosophy here.

The code has been reviewed quite a bit, but problems may arise... please,
don't panic! Send me Emails: bmilekic@freebsd.org

Reviewed by: jlemon, cp, alfred, others?


# 7cedb9d1 21-Sep-2000 John Baldwin <jhb@FreeBSD.org>

Fixing a sorting error in teh subsystem list. 7 < 8, not 8 < 7.


# 0384fff8 06-Sep-2000 Jason Evans <jasone@FreeBSD.org>

Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).

Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh


# 3f54a085 20-Aug-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Remove all traces of Julians DEVFS (incl from kern/subr_diskslice.c)

Remove old DEVFS support fields from dev_t.

Make uid, gid & mode members of dev_t and set them in make_dev().

Use correct uid, gid & mode in make_dev in disk minilayer.

Add support for registering alias names for a dev_t using the
new function make_dev_alias(). These will show up as symlinks
in DEVFS.

Use makedev() rather than make_dev() for MFSs magic devices to prevent
DEVFS from noticing this abuse.

Add a field for DEVFS inode number in dev_t.

Add new DEVFS in fs/devfs.

Add devfs cloning to:
disk minilayer (ie: ad(4), sd(4), cd(4) etc etc)
md(4), tun(4), bpf(4), fd(4)

If DEVFS add -d flag to /sbin/inits args to make it mount devfs.

Add commented out DEVFS to GENERIC


# babff34a 11-Aug-2000 Peter Wemm <peter@FreeBSD.org>

Oops, forgot this file. Log message for completeness:
Clean up some low level bootstrap code:

- stop using the evil 'struct trapframe' argument for mi_startup()
(formerly main()). There are much better ways of doing it.
- do not use prepare_usermode() - setregs() in execve() will do it
all for us as long as the p_md.md_regs pointer is set. (which is
now done in machdep.c rather than init_main.c. The Alpha port did it
this way all along and is much cleaner).
- collect all the magic %cr0 etc register settings into one place and
have the AP's call that instead of using magic numbers (!!) that keep
changing over and over again.
- Make it safe to call kthread_create() earlier, including during the
device probe sequence. It doesn't need the callback mechanism that
NetBSD's version uses.
- kthreads created this way are root-less as they exist before the root
filesystem is mounted. init(1) is set up so that it aquires the root
pointers prior to running. If other kthreads want filesystem acccess
we can make this code more generic.
- set all threads start times once we have decided what time it is.
- init uses a trampoline rather than the evil prepare_usermode() hack.
- kern_descrip.c has a couple of tweaks to deal with forking when there
is no rootdir or cwd etc.
- adjust the early SYSINIT() sequence so that a few prereqisites are in
place. eg: make sure the run queue is initialized before doing forks.

With this, the USB code can easily create a kthread to do the device
tree discovery. (I have tested it, it works nicely).

There are still some open issues before this is truely useful.
- tsleep() does not like working before the clock is running. It
sort-of tries to spin wait, but it can do more useful things now.
- stopping a kthread in kld code at unload time is "interesting" but
we have a solution for that.

The Alpha code needs no changes for this. It already uses pretty much the
same strategies, but a little cleaner.


# e3975643 25-May-2000 Jake Burkholder <jake@FreeBSD.org>

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 740a1973 23-May-2000 Jake Burkholder <jake@FreeBSD.org>

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 047e0b4e 28-Feb-2000 Greg Lehey <grog@FreeBSD.org>

Add SI_SUB_VINUM startup sequence for Vinum. This is part of Vinum
root file system support.

Approved-by: jkh


# 1f6889a1 16-Feb-2000 Matthew Dillon <dillon@FreeBSD.org>

Fix null-pointer dereference crash when the system is intentionally
run out of KVM through a mmap()/fork() bomb that allocates hundreds
of thousands of vm_map_entry structures.

Add panic to make null-pointer dereference crash a little more verbose.

Add a new sysctl, vm.max_proc_mmap, which specifies the maximum number
of mmap()'d spaces (discrete vm_map_entry's in the process). The value
defaults to around 9000 for a 128MB machine. The test is scaled for the
number of processes sharing a vmspace (aka linux threads). Setting
the value to 0 disables the feature.

PR: kern/16573
Approved by: jkh


# 664a31e4 28-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.


# 3e2c6ca3 05-Oct-1999 Nick Hibma <n_hibma@FreeBSD.org>

Removal of sys/device.h

- Move intrhook stuff into kernel.h
- Remove all occurrences of #device <device.h>
- Add kernel.h were necessary (nowhere)
- delete device.h

This file contained the structures for cfdata (old style config) and is no
longer used. It was included by most drivers.

It confuses the remote debugger as the definition of 'struct device' in
device.h is found before the one in bus_private.h.


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 9f6f6f71 05-Jul-1999 Mike Smith <msmith@FreeBSD.org>

A couple of new macros to make implementing tunable values slightly easier.


# e929c00d 03-Jul-1999 Kirk McKusick <mckusick@FreeBSD.org>

The buffer queue mechanism has been reformulated. Instead of having
QUEUE_AGE, QUEUE_LRU, and QUEUE_EMPTY we instead have QUEUE_CLEAN,
QUEUE_DIRTY, QUEUE_EMPTY, and QUEUE_EMPTYKVA. With this patch clean
and dirty buffers have been separated. Empty buffers with KVM
assignments have been separated from truely empty buffers. getnewbuf()
has been rewritten and now operates in a 100% optimal fashion. That is,
it is able to find precisely the right kind of buffer it needs to
allocate a new buffer, defragment KVM, or to free-up an existing buffer
when the buffer cache is full (which is a steady-state situation for
the buffer cache).

Buffer flushing has been reorganized. Previously buffers were flushed
in the context of whatever process hit the conditions forcing buffer
flushing to occur. This resulted in processes blocking on conditions
unrelated to what they were doing. This also resulted in inappropriate
VFS stacking chains due to multiple processes getting stuck trying to
flush dirty buffers or due to a single process getting into a situation
where it might attempt to flush buffers recursively - a situation that
was only partially fixed in prior commits. We have added a new daemon
called the buf_daemon which is responsible for flushing dirty buffers
when the number of dirty buffers exceeds the vfs.hidirtybuffers limit.
This daemon attempts to dynamically adjust the rate at which dirty buffers
are flushed such that getnewbuf() calls (almost) never block.

The number of nbufs and amount of buffer space is now scaled past the
8MB limit that was previously imposed for systems with over 64MB of
memory, and the vfs.{lo,hi}dirtybuffers limits have been relaxed
somewhat. The number of physical buffers has been increased with the
intention that we will manage physical I/O differently in the future.

reassignbuf previously attempted to keep the dirtyblkhd list sorted which
could result in non-deterministic operation under certain conditions,
such as when a large number of dirty buffers are being managed. This
algorithm has been changed. reassignbuf now keeps buffers locally sorted
if it can do so cheaply, and otherwise gives up and adds buffers to
the head of the dirtyblkhd list. The new algorithm is deterministic but
not perfect. The new algorithm greatly reduces problems that previously
occured when write_behind was turned off in the system.

The P_FLSINPROG proc->p_flag bit has been replaced by the more descriptive
P_BUFEXHAUST bit. This bit allows processes working with filesystem
buffers to use available emergency reserves. Normal processes do not set
this bit and are not allowed to dig into emergency reserves. The purpose
of this bit is to avoid low-memory deadlocks.

A small race condition was fixed in getpbuf() in vm/vm_pager.c.

Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Kirk McKusick <mckusick@mckusick.com>


# 9c8b8baa 01-Jul-1999 Peter Wemm <peter@FreeBSD.org>

Slight reorganization of kernel thread/process creation. Instead of using
SYSINIT_KT() etc (which is a static, compile-time procedure), use a
NetBSD-style kthread_create() interface. kproc_start is still available
as a SYSINIT() hook. This allowed simplification of chunks of the
sysinit code in the process. This kthread_create() is our old kproc_start
internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work
the same as the NetBSD one.

One thing I'd like to do shortly is get rid of nfsiod as a user initiated
process. It makes sense for the nfs client code to create them on the
fly as needed up to a user settable limit. This means that nfsiod
doesn't need to be in /sbin and is always "available". This is a fair bit
easier to do outside of the SYSINIT_KT() framework.


# 442e6437 06-May-1999 Peter Wemm <peter@FreeBSD.org>

Move the proc0 init before the driver probe/attach etc since machdep.c
doesn't set curproc anymore, and certain drivers like to tsleep() during
probes, usb for example.


# e9189611 17-Apr-1999 Peter Wemm <peter@FreeBSD.org>

Well folks, this is it - The second stage of the removal for build support
for LKM's..


# bc814931 29-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

More const fixes for -Wall, -Wcast-qual


# 12cb7f73 29-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

Commit a solution for the SYSINIT vs C_SYSINIT conundrum. The
problem and solution is outlined in the comments, but basically
we needed a way to allow the SYSINIT mechanism to handle const void *
arguments and function pointers as well as non-const arguments and
function pointers while still maintaining the compiler's ability to
issue warnings if you try to use a bad combination.


# 5a24726b 28-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

Clarify the SYSINIT problem by breaking SYSINIT's up into a void *
version and a const void * version. Currently the const void * version
simply calls the void * version ( i.e. no 'fix' is in place ).

A solution needs to be found for the C_SYSINIT ( etc...) family of
macros that allows const void * without generating a warning, but
does not allow non-const void *.


# 8aef1712 27-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile


# 045205a8 13-Jan-1999 John Polstra <jdp@FreeBSD.org>

Move the linker set definitions into a new header file
<sys/linker_set.h>. <sys/kernel.h> includes the new header, so
functionally everything is still the same.


# 420d68e8 20-Dec-1998 Bruce Evans <bde@FreeBSD.org>

Backed out rev.1.46. It had no effect for aout, was incomplete for elf,
and had gratuitous gcc dependencies. Rev.1.47 has a better fix.


# f2d4e36c 20-Dec-1998 Doug Rabson <dfr@FreeBSD.org>

Add a workaround to avoid 'defined but not used' warnings for linker
sets on i386 and alpha ELF kernels.


# a05e9b73 03-Dec-1998 John Birrell <jb@FreeBSD.org>

Add __attribute__ ((unused)) to the SYSINIT etc macros which declare
static structures that are used with the data set magic. This allows
kernel modules, for example, to be compiled with -Wall -Werror.


# a9ef8bf7 15-Nov-1998 Bruce Evans <bde@FreeBSD.org>

Don't generate module event handlers of the wrong (old) type.

Fixed some pedantic syntax errors (an extra semicolon in each
SYSUNINIT() expansion).


# 3234f9c0 10-Nov-1998 Peter Wemm <peter@FreeBSD.org>

New macro for building a linker set of things to do at module unload
time (eg: disconnect malloc types contained within a module), opposite
of SYSINIT().


# aa855a59 15-Oct-1998 Peter Wemm <peter@FreeBSD.org>

*gulp*. Jordan specifically OK'ed this..

This is the bulk of the support for doing kld modules. Two linker_sets
were replaced by SYSINIT()'s. VFS's and exec handlers are self registered.
kld is now a superset of lkm. I have converted most of them, they will
follow as a seperate commit as samples.
This all still works as a static a.out kernel using LKM's.


# 3c1cc396 09-Oct-1998 Peter Wemm <peter@FreeBSD.org>

Add SI_SUB_KLD
First part of support for merging SYSINIT sets.

This, and the following KLD commits have been OK'ed by jkh and msmith
based on my assertion that it works here (barring merge errors :-).


# ecbb00a2 07-Jun-1998 Doug Rabson <dfr@FreeBSD.org>

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
inline with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# e95ee6a0 08-May-1998 Bruce Evans <bde@FreeBSD.org>

Fixed overflow in sysinit enum constants. In that little-used
language, ANSI C, enum constants must be representable as ints.
We assumed at-least-33-bit ints. This worked on some 32-bit
systems because we don't mix negative sysinit enum constants with
too-large sysinit enum constants, and the compiler used an unsigned
32-bit type for sysinit enum variables, so sysinit enum variables
were sorted correctly. The fix lops off 4 hopefully-unused bits
so that we now only assume at-least-29-bit ints.


# c1087c13 15-Apr-1998 Bruce Evans <bde@FreeBSD.org>

Support compiling with `gcc -ansi'.


# 00af9731 04-Apr-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Time changes mark 2:

* Figure out UTC relative to boottime. Four new functions provide
time relative to boottime.

* move "runtime" into struct proc. This helps fix the calcru()
problem in SMP.

* kill mono_time.

* add timespec{add|sub|cmp} macros to time.h. (XXX: These may change!)

* nanosleep, select & poll takes long sleeps one day at a time

Reviewed by: bde
Tested by: ache and others


# 8a6472b7 28-Mar-1998 Peter Dufault <dufault@FreeBSD.org>

Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and
_KPOSIX_PRIORITY_SCHEDULING options to work. Changes:

Change all "posix4" to "p1003_1b". Misnamed files are left
as "posix4" until I'm told if I can simply delete them and add
new ones;

Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux;

Add man pages for _POSIX_PRIORITY_SCHEDULING system calls;

Add options to LINT;

Minor fixes to P1003_1B code during testing.


# 74b2192a 11-Dec-1997 John Dyson <dyson@FreeBSD.org>

We have had support for running the kernel daemons as threads for
quite a while, but forgot to do so. For now, this code supports
most daemons running as kernel threads in UP kernels, and as
full processes in SMP. We will soon be able to run them as
threads in SMP, but not yet.


# c4fa014b 18-Nov-1997 Bruce Evans <bde@FreeBSD.org>

Fixed pedantic syntax errors caused by trailing semicolon in the
__ELF__ case of the definition of MAKE_SET() and in the PSEUDO_LKM
case of the definition of PSEUDO_SET().


# 5957b261 21-Sep-1997 Justin T. Gibbs <gibbs@FreeBSD.org>

buf.h:
Change the definition of a buffer queue so that bufqdisksort can
properly deal with bordered writes.

Add inline functions for accessing buffer queues. This should be
considered an opaque data structure by clients.

callout.h:
New callout implementation.

device.h:
Add support for CAM interrupts.

disk.h:
disklabel.h:
tqdisksort->bufqdisksort

kernel.h:
Add new configuration entries for configuration hooks and calling
cpu_rootconf and cpu_dumpconf.

param.h:
Add a priority for sleeping waiting on config hooks.

proc.h:
Update for new callout implementation.

queue.h:
Add TAILQ_HEAD_INITIALIZER from NetBSD.

systm.h:
Add prototypes for cpu_root/dumpconf, splcam, splsoftcam, etc..


# 5faa3121 24-Jun-1997 John Hay <jhay@FreeBSD.org>

Add tickadj to struct clockinfo, like NetBSD and OpenBSD.
NOTE: libc, time, kgmon and rpc.rstatd will have to be recompiled.


# b3196e4b 22-Jun-1997 Peter Wemm <peter@FreeBSD.org>

Preliminary support for per-cpu data pages.

This eliminates a lot of #ifdef SMP type code. Things like _curproc reside
in a data page that is unique on each cpu, eliminating the expensive macros
like: #define curproc (SMPcurproc[cpunumber()])

There are some unresolved bootstrap and address space sharing issues at
present, but Steve is waiting on this for other work. There is still some
strictly temporary code present that isn't exactly pretty.

This is part of a larger change that has run into some bumps, this part is
standalone so it should be safe. The temporary code goes away when the
full idle cpu support is finished.

Reviewed by: fsmp, dyson


# 85d7ee59 28-May-1997 Peter Wemm <peter@FreeBSD.org>

Don't refer to NCPU in extern decl for SMPruntime[]


# 3677aca0 07-May-1997 Peter Wemm <peter@FreeBSD.org>

remove opt_smp.h
move declaration of SMPruntime[] to here next to the #define and the
uniprocessor counterpart


# 477a642c 26-Apr-1997 Peter Wemm <peter@FreeBSD.org>

Man the liferafts! Here comes the long awaited SMP -> -current merge!

There are various options documented in i386/conf/LINT, there is more to
come over the next few days.

The kernel should run pretty much "as before" without the options to
activate SMP mode.

There are a handful of known "loose ends" that need to be fixed, but
have been put off since the SMP kernel is in a moderately good condition
at the moment.

This commit is the result of the tinkering and testing over the last 14
months by many people. A special thanks to Steve Passe for implementing
the APIC code!


# 9081eec1 22-Apr-1997 John Polstra <jdp@FreeBSD.org>

Make the necessary changes so that an ELF kernel can be built. I
have successfully built, booted, and run a number of different ELF
kernel configurations, including GENERIC. LINT also builds and
links cleanly, though I have not tried to boot it.

The impact on developers is virtually nil, except for two things.
All linker sets that might possibly be present in the kernel must be
listed in "sys/i386/i386/setdefs.h". And all C symbols that are
also referenced from assembly language code must be listed in
"sys/i386/include/asnames.h". It so happens that failure to do
these things will have no impact on the a.out kernel. But it will
break the build of the ELF kernel.

The ELF bootloader works, but it is not ready to commit quite yet.


# 0ddf9be1 06-Apr-1997 Peter Dufault <dufault@FreeBSD.org>

Make MOD_* macros almost consistent:

Use the name argument almost the same in all LKM types. Maintain
the current behavior for the external (e.g., modstat) name for DEV,
EXEC, and MISC types being #name ## "_mod" and SYCALL and VFS only
#name. This is a candidate for change and I vote just the name without
the "_mod".

Change the DISPATCH macro to MOD_DISPATCH for consistency with the
other macros.

Add an LKM_ANON #define to eliminate the magic -1 and associated
signed/unsigned warnings.

Add MOD_PRIVATE to support wcd.c's poking around in the lkm structure.

Change source in tree to use the new interface.

Reviewed by: Bruce Evans


# 774fce94 22-Mar-1997 Bruce Evans <bde@FreeBSD.org>

Removed `volatile' from declaration of `time', and removed the resulting
null casts. `time' is nonvolatile for accesses within a region locked
by splclock()/splx(). Accesses outside such a region are invalid, and
splx() must have the side effect of potentially changing all global
variables (since there are hundreds of sort of volatile variables like
`time'), so declaring `time' as volatile didn't have any real benefits.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 4dd8ff7e 01-Jan-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Make it possible to test kernel code in a userland harness, even if it
uses MAKE_SET or derivatives and <sys/time.h> at the same time...


# a1971699 20-Sep-1996 Bruce Evans <bde@FreeBSD.org>

Fixed lots of warnings from gcc-2.7.x about "left-hand operand of
comma expression has no effect" in the MAKE_SET() macro. This also
fixes compiling with -O3 (which removes static functions unless
there is a suitable reference to them). Declaring all the static
symbols as __unused would also fix the warning, but would be bogus
(they are used) and wouldn't fix -O3. However, the dummy pointers
for the references waste about 1.5K text and 20K symbol space for
GENERIC. This wastage hasn't changed - the dummy pointers are just
nonzero now.


# e7fa2650 03-Sep-1996 Bruce Evans <bde@FreeBSD.org>

`struct linker_set execsw_set' was declared as const and pointers in it
were declared as non-const. This is backwards (_lkm_exec() changes the
pointers but all the target `struct execsw's are const). Fixed this
and poisoned related declarations to match and removed the bogus casts
that hid the bug.


# cc3d5226 23-Jun-1996 Bruce Evans <bde@FreeBSD.org>

Unstaticize psratio and staticize profprocs. psratio needs to be exported
to trap.c to fix user profiling.


# 6c5e9bbd 30-Jan-1996 Mike Pritchard <mpp@FreeBSD.org>

Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.


# 9b93d9b6 16-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Really finished (?) cleaning up sysinit stuff.


# 53ac6efb 29-Nov-1995 Julian Elischer <julian@FreeBSD.org>

OK, that's it..
That's EVERY SINGLE driver that has an entry in conf.c..
my next trick will be to define cdevsw[] and bdevsw[]
as empty arrays and remove all those DAMNED defines as well..

Each of these drivers has a SYSINIT linker set entry
that comes in very early.. and asks teh driver to add it's own
entry to the two devsw[] tables.

some slight reworking of the commits from yesterday (added the SYSINIT
stuff and some usually wrong but token DEVFS entries to all these
devices.

BTW does anyone know where the 'ata' entries in conf.c actually reside?
seems we don't actually have a 'ataopen() etc...

If you want to add a new device in conf.c
please make sure I know
so I can keep it up to date too..

as before, this is all dependent on #if defined(JREMOD)
(and #ifdef DEVFS in parts)


# a5d3a441 19-Nov-1995 Poul-Henning Kamp <phk@FreeBSD.org>

Close the "unused" warning for things in linker-sets.
This will also allow us to catch typos in the setname by running a
nm through a grep.


# b3e24f9c 14-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Changed the first (name) arg of MOD_DEV(), MOD_EXEC() and MOD_MISC()
from a string to an identifier so that it can be used to generate
declarations and strings. It's much easier to stringize an identifier
than to identifize a string. A uniform naming scheme must be used
for the automatically generated things to apply. This is a feature.

Used the module identifer to generate prototypes for the module load,
unload and stat functions. Removed the few prototypes for these that
already existed.

Used the module identifier to generate a unique struct tag in MOD_DEV().
This should probably be done for all the MOD_*() macros.

Moved the trailing semicolon from the MOD_*() macro definitions to the
macro invocations that didn't already (bogusly) have it.

Staticized the module load and unload functions.

Added function return types for the module load, unload and stat functions.

lkm/ibcs2/ibcs2.c:
Included <sys/sysproto.h> to get everything prototyped.
Cleaned up #includes.

lkm/ibcs2/ipfw.c:
Cleaned up #includes.

lkm/linux/linux.c:
The module name had to change from "linux_emulator" to "linux_mod" to
be automatically generated.
Cleaned up #includes.

lkm/syscons/*/*_saver.c:
Completed delcarations of function pointers.

sys/i386/isa/atapi.c:
The module name had to change from "atapi" to "atapi_mod" to be
automatically generated.

sys/i386/isa/wcd.c:
Used the fixed MOD_DEV(). This module has two devices and expanded the
macro in the source instead of fixing it.
The module names had to change from "wcd" and "rwcd" to "wcd_mod" and
"rwcd_mod" to be automatically generated.

sys/pccard/pcic.c:
The module name had to change from "pcic" to "pcic_mod" to be
automatically generated.


# 0ea34823 13-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Replaced nosys() by lkm_nullcmd(). Always call lkm load/unload/stat
functions instead of skipping the call if the function is nosys().
nosys() returned the wrong value as well as having the wrong type.


# 4590fd3a 09-Sep-1995 David Greenman <dg@FreeBSD.org>

Fixed init functions argument type - caddr_t -> void *. Fixed a couple of
compiler warnings.


# 8af5d536 02-Sep-1995 Julian Elischer <julian@FreeBSD.org>

devfs changes..
changes to allow devices that don't probe (e.g. /dev/mem)
to create devfs entries
this required giving 'configure' its own SYSINIT entry
so we could duck in just before it with a DEVFS init
and some device inits..
my devfs now looks like:
./misc
./misc/speaker
./misc/mem
./misc/kmem
./misc/null
./misc/zero
./misc/io
./misc/console
./misc/pcaudio
./misc/pcaudioctl
./disks
./disks/rfloppy
./disks/rfloppy/fd0.1440
./disks/rfloppy/fd1.1200
./disks/floppy
./disks/floppy/fd0.1440
./disks/floppy/fd1.1200
also some sligt cleanups.. DEVFS needs a lot of work
but I'm getting back to it..


# 6dbc0269 31-Aug-1995 Bruce Evans <bde@FreeBSD.org>

Fix fatal function type mismatches in lkms. lkm init functions recently
gained a dummy argument for compatibility with sysinit functions, but
this arg wasn't passed for lkms outside the kernel.


# 2b14f991 28-Aug-1995 Julian Elischer <julian@FreeBSD.org>

Reviewed by: julian with quick glances by bruce and others
Submitted by: terry (terry lambert)
This is a composite of 3 patch sets submitted by terry.
they are:
New low-level init code that supports loadbal modules better
some cleanups in the namei code to help terry in 16-bit character support
some changes to the mount-root code to make it a little more
modular..

NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able
to test those cases..

certainly mounting root of disk still works just fine..
mfs should work but is untested. (tomorrows task)

The low level init stuff includes a total rewrite of init_main.c
to make it possible for new modules to have an init phase by simply
adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can
be added to the kernel without editing any other files other than the
'files' file.


# 69244b69 20-Mar-1995 Garrett Wollman <wollman@FreeBSD.org>

Support for pseudo-device LKMs. Note that this is restricted to only
one pseudo per module (a restriction which will eventually be lifted) and
isthus not in its final form.


# e1838a91 17-Mar-1995 Garrett Wollman <wollman@FreeBSD.org>

Beginnings of support for loadable pseudo-devices. bsd.kmod.mk support
and Makefiles for the more interesting ones to come on Monday.


# b5e8ce9f 16-Mar-1995 Bruce Evans <bde@FreeBSD.org>

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 222f2718 05-Oct-1994 Garrett Wollman <wollman@FreeBSD.org>

Define a new macro. PSEUDO_SET, to hide TEXT_SET(pseudo_set, foo)
from users. Eventually this will be used for LKM support.


# 63b46ee5 23-Sep-1994 Garrett Wollman <wollman@FreeBSD.org>

Add MIB variable kern.bootfile (R/W) giving the name of the booted kernel.
Kernel variable is kernelname[].


# af9da405 20-Aug-1994 Paul Richards <paul@FreeBSD.org>

Made them all idempotent.
Reviewed by:
Submitted by:


# 3c4dd356 02-Aug-1994 David Greenman <dg@FreeBSD.org>

Added $Id$


# 26f9a767 25-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# df8bae1d 24-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

BSD 4.4 Lite Kernel Sources