History log of /freebsd-current/sys/kern/kern_linker.c
Revision Date Author Comments
# 28a59100 05-Mar-2024 Austin Shafer <ashafer@badland.io>

linuxkpi: Provide a non-NULL value for THIS_MODULE

THIS_MODULE is used to differentiate modules on Linux. We currently
completely stub out any Linux struct module usage, but THIS_MODULE
is still used to populate the "owner" fields of various drivers.
Even though we don't actually dereference these "owner" fields they
are still used by drivers to check if devices/dmabufs/etc come
from different modules. For example, during DRM GEM import some
drivers check if the dmabuf's owner matches the dev's owner. If
they match because they are both NULL drivers may incorrectly think
two resources come from the same module.

This adds a general purpose __this_linker_file which will point to
the linker file of the module that uses it. We can then use that
pointer to have a valid value for THIS_MODULE.

Reviewed by: bz, jhb
Differential Revision: https://reviews.freebsd.org/D44306


# 7ef5c19b 31-Mar-2024 Mark Johnston <markj@FreeBSD.org>

kern linker: Don't invoke dtors without having invoked ctors

I have a kernel module which fails to load because of an unrecognized
relocation type. link_elf_load_file() fails before the module's ctors
are invoked and it calls linker_file_unload(), which causes the module's
dtors to be executed, resulting in a kernel panic.

Add a flag to the linker file to ensure that dtors are not invoked if
unloading due to an error prior to ctors being invoked.

At the moment I only implemented this for link_elf_obj.c since
link_elf.c doesn't invoke dtors, but I refactored link_elf.c to make
them more similar.

Fixes: 9e575fadf491 ("link_elf_obj: Invoke fini callbacks")
Reviewed by: zlei, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D44559


# 1c7307cf 26-Mar-2024 Zhenlei Huang <zlei@FreeBSD.org>

kern linker: Make linker_file_add_dependency() void

The only possible return value has been zero since cee9542d51f0.

No functional change intended.

Reviewed by: dfr
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44507


# 39450eba 26-Mar-2024 Zhenlei Huang <zlei@FreeBSD.org>

kern linker: Do not touch userrefs of the kernel file

A nonzero `userrefs` of a linker file indicates that the file, either
loaded from kldload(2) or preloaded, can be unloaded via kldunload(2).
As for the kernel file, it can be unloaded by the loader but should not
be after initialization.

This change fixes regression from d9ce8a41eac9 which incidentally
increases `userrefs` of the kernel file.

Reviewed by: dfr, dab, jhb
Fixes: d9ce8a41eac9 kern_linker: Handle module-loading failures in preloaded .ko files
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42530


# f43ff3e1 25-Mar-2024 Zhenlei Huang <zlei@FreeBSD.org>

kern linker: Do not unload a module if it has dependants

Despite the name, linker_file_unload() will drop a reference and return
success when the module file has dependants, i.e. it has more than one
reference. When user request to unload such modules then the kernel
should reject unambiguously and immediately.

PR: 274986
Reviewed by: dfr, dab, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42527


# c21bc6f3 21-Mar-2024 Bojan Novković <bnovkov@FreeBSD.org>

ddb: Add CTF-based pretty printing

Add basic CTF support and a CTF-powered pretty-printer to ddb.

The db_ctf.* files expose a basic interface for fetching type
data for ELF symbols, interacting with the CTF string table,
and translating type identifiers to type data.

The db_pprint.c file uses those interfaces to implement
a pretty-printer for all kernel ELF symbols.
The pretty-printer works with symbol names and arbitrary addresses:
pprint struct thread 0xffffffff8194ad90

Pretty-printing currently only works after the root filesystem
gets mounted because the CTF info is not available during
early boot.

Differential Revision: https://reviews.freebsd.org/D37899
Approved by: markj (mentor)


# ecf710f0 06-Nov-2023 Zhenlei Huang <zlei@FreeBSD.org>

kern linker: Do not retry loading modules on EEXIST

LINKER_LOAD_FILE() calls linker_load_dependencies() which will return
EEXIST in case the module to be loaded has already been compiled into
the kernel. Since the format of the module is now recognized then there
is no need to retry loading with a different linker, otherwise the
userland will get misleading error number ENOEXEC.

PR: 274936
Reviewed by: dfr
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42474


# 110113bc 09-Sep-2023 Zhenlei Huang <zlei@FreeBSD.org>

sysctl(9): Enable vnet sysctl variables to be loader tunable

Complete phase two of 3da1cf1e88f8.

In 3da1cf1e88f8, the meaning of the flag CTLFLAG_TUN is extended to
automatically check if there is a kernel environment variable which
shall initialize the SYSCTL during early boot. It works for all SYSCTL
types both statically and dynamically created ones, except for the
SYSCTLs which belong to VNETs.

This change extends the meaning further, to allow it also works for
the SYSCTLs which belong to VNETs. A typical usage is
```
VNET_DEFINE_STATIC(int, foo) = 0;
SYSCTL_INT(_net, OID_AUTO, foo, CTLFLAG_RWTUN | CTLFLAG_VNET,
&VNET_NAME(foo), 0, "Description of the foo loader tunable");
```

Note that the implementation has a limitation. It behaves the same way
as that of non-vnet loader tunables. That is, after the kernel or modules
being initialized, any changes (e.g. via kenv) to kernel environment
variable will not affect the corresponding vnet variable of subsequently
created VNETs. To overcome it, we can use TUNABLE_XXX_FETCH to fetch
the kernel environment variable into those vnet variables during vnet
constructing.

This change will fix the following SYSCTLs those belong to VNETs and
have CTLFLAG_TUN flag:
```
net.add_addr_allfibs
net.bpf.optimize_writers
net.inet.tcp.fastopen.ccache_buckets
net.link.bridge.inherit_mac
net.link.bridge.ipfw_arp
net.link.bridge.log_stp
net.link.bridge.pfil_bridge
net.link.bridge.pfil_local_phys
net.link.bridge.pfil_member
net.link.bridge.pfil_onlyip
net.link.lagg.default_use_flowid
net.link.lagg.default_use_numa
net.link.lagg.default_flowid_shift
net.link.lagg.lacp.debug
net.link.lagg.lacp.default_strict_mode
```

Although the following vnet SYSCTLs have CTLFLAG_TUN flag, theirs
values are re-fetched via TUNABLE_XXX_FETCH, thus are not affected
by this change.
```
net.inet.ip.reass_hashsize
net.inet.tcp.hostcache.cachelimit
net.inet.tcp.hostcache.hashsize
net.inet.tcp.hostcache.bucketlimit
net.inet.tcp.syncache.bucketlimit
net.inet.tcp.syncache.cachelimit
net.inet.tcp.syncache.hashsize
net.key.spdcache.maxentries
net.key.spdcache.threshold
```

In memoriam: hselasky
Discussed with: hselasky, glebius
Fixes: 3da1cf1e88f8 Extend the meaning of the CTLFLAG_TUN flag ...
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D39638


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# ba8cc6d7 12-Mar-2023 Mateusz Guzik <mjg@FreeBSD.org>

vfs: use __enum_uint8 for vtype and vstate

This whacks hackery around only reading v_type once.

Bump __FreeBSD_version to 1400093


# 53d0b9e4 30-May-2023 Jessica Clarke <jrtc27@FreeBSD.org>

pmc: Provide full path to modules from kernel linker

This unifies the user object and kernel module paths in libpmcstat,
allows modules loaded from non-standard locations (e.g. from a user's
home directory when testing) to be found and, since buffer is what all
the warnings here use (they were never updated when buffer_modules were
added to pick based on where the file was found) has the side-effect of
ensuring the messages are correct.

This includes obsoleting the now-superfluous -k option in pmcstat.

This change breaks the hwpmc ABI and will be followed by a bump to the
pmc major version.

Reviewed by: jhb, jkoshy, mhorne
Differential Revision: https://reviews.freebsd.org/D40048


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# c84c5e00 18-Jul-2022 Mitchell Horne <mhorne@FreeBSD.org>

ddb: annotate some commands with DB_CMD_MEMSAFE

This is not completely exhaustive, but covers a large majority of
commands in the tree.

Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35583


# 31d1b816 28-May-2022 Dmitry Chagin <dchagin@FreeBSD.org>

sysent: Get rid of bogus sys/sysent.h include.

Where appropriate hide sysent.h under proper condition.

MFC after: 2 weeks


# bb92cd7b 24-Mar-2022 Mateusz Guzik <mjg@FreeBSD.org>

vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)


# 5a8fceb3 21-Feb-2022 Mitchell Horne <mhorne@FreeBSD.org>

boottrace: trace annotations for startup and shutdown

Add trace events for execution of SYSINITs (both static and dynamically
loaded), and to the various steps in the shutdown/panic/reboot paths.

Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
X-NetApp-PR: #23
Differential Revision: https://reviews.freebsd.org/D30187


# 877eea42 19-Feb-2022 Mitchell Horne <mhorne@FreeBSD.org>

kern_linker.c: sort includes

This is preferred by style(9). Do this ahead of adding another include.

Reviewed by: imp, kevans
MFC after: 3 days
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D30185


# 95c20faf 07-Nov-2021 Konstantin Belousov <kib@FreeBSD.org>

kernel linker: do not read debug symbol tables for non-debug symbols

In particular, this prevents resolving locals from other files.
To access debug symbol tables, add LINKER_LOOKUP_DEBUG_SYMBOL and
LINKER_DEBUG_SYMBOL_VALUES kobj methods, which are allowed to use
any types of present symbols in all tables.

PR: 207898
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32878


# 72f66626 16-Nov-2021 Konstantin Belousov <kib@FreeBSD.org>

linker_debug_symbol_values(): use proper linker interface to get debug values

Reported by: markj
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32878


# 4f924a78 12-Nov-2021 Konstantin Belousov <kib@FreeBSD.org>

linker_kldload_busy(): allow recursion

Some drivers recursively loads modules by explicit calls to kldload
during initialization, which might occur during kldload.

PR: 259748
Reported and tested by: thj
Reviewed by: markj
Sponsored by: Nvidia networking
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32972


# 7e1d3eef 25-Nov-2021 Mateusz Guzik <mjg@FreeBSD.org>

vfs: remove the unused thread argument from NDINIT*

See b4a58fbf640409a1 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.


# 9e575fad 29-Jul-2021 Mark Johnston <markj@FreeBSD.org>

link_elf_obj: Invoke fini callbacks

This is required for KASAN: when a module is unloaded, poisoned regions
(e.g., pad areas between global variables) are left as such, so if they
are reused as KLDs are loaded, false positives can arise.

Reported by: pho, Jenkins
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31339


# e266a0f7 20-May-2021 Konstantin Belousov <kib@FreeBSD.org>

kern linker: do not allow more than one kldload and kldunload syscalls simultaneously

kld_sx is dropped e.g. for executing sysinits, which allows user
to initiate kldunload while module is not yet fully initialized.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D30456
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# f1f98706 18-Apr-2021 Warner Losh <imp@FreeBSD.org>

Minor style cleanup

We prefer 'while (0)' to 'while(0)' according to grep and stlye(9)'s
space after keyword rule. Remove a few stragglers of the latter.
Many of these usages were inconsistent within the file.

MFC After: 3 days
Sponsored by: Netflix


# 6fed89b1 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

kern: clean up empty lines in .c and .h files


# 694f3fc8 31-May-2020 Peter Wemm <peter@FreeBSD.org>

Clarify which hints file is the source of an error message.

PR: 246688
Submitted by: Ashish Gupta <lrx337@gmail.com>
MFC after: 1 week


# 615a9fb5 19-May-2020 Mark Johnston <markj@FreeBSD.org>

Use the symbolic name for "modmetadata_set".

MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# d2222aa0e 07-Mar-2020 Mateusz Guzik <mjg@FreeBSD.org>

fd: use smr for managing struct pwd

This has a side effect of eliminating filedesc slock/sunlock during path
lookup, which in turn removes contention vs concurrent modifications to the fd
table.

Reviewed by: markj, kib
Differential Revision: https://reviews.freebsd.org/D23889


# 8d03b99b 01-Mar-2020 Mateusz Guzik <mjg@FreeBSD.org>

fd: move vnodes out of filedesc into a dedicated structure

The new structure is copy-on-write. With the assumption that path lookups are
significantly more frequent than chdirs and chrooting this is a win.

This provides stable root and jail root vnodes without the need to reference
them on lookup, which in turn means less work on globally shared structures.
Note this also happens to fix a bug where jail vnode was never referenced,
meaning subsequent access on lookup could run into use-after-free.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23884


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# fe20aaec 22-Feb-2020 Ryan Libby <rlibby@FreeBSD.org>

sys/kern: quiet -Wwrite-strings

Quiet a variety of Wwrite-strings warnings in sys/kern at low-impact
sites. This patch avoids addressing certain others which would need to
plumb const through structure definitions.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D23798


# 3ff65f71 30-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

Remove duplicated empty lines from kern/*.c

No functional changes.


# b249ce48 03-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

vfs: drop the mostly unused flags argument from VOP_UNLOCK

Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427


# b19c9dea 15-Dec-2019 Ian Lepore <ian@FreeBSD.org>

Rewrite arm kernel stack unwind code to work when unwinding through modules.

The arm kernel stack unwinder has apparently never been able to unwind when
the path of execution leads through a kernel module. There was code that
tried to handle modules by looking for the unwind data in them, but it did
so by trying to find symbols which have never existed in arm kernel
modules. That caused the unwind code to panic, and because part of panic
handling calls into the unwind code, that just created a recursion loop.

Locating the unwind data in a loaded module requires accessing the Elf
section headers to find the SHT_ARM_EXIDX section. For preloaded modules
those headers are present in a metadata blob. For dynamically loaded
modules, the headers are present only while the loading is in progress; the
memory is freed once the module is ready to use. For that reason, there is
new code in kern/link_elf.c, wrapped in #ifdef __arm__, to extract the
unwind info while the headers are loaded. The values are saved into new
fields in the linker_file structure which are also conditional on __arm__.

In arm/unwind.c there is new code to locally cache the per-module info
needed to find the unwind tables. The local cache is crafted for lockless
read access, because the unwind code often needs to run in context where
sleeping is not allowed. A large comment block describes the local cache
list, so I won't repeat it all here.


# c2a8682a 28-Nov-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Factor out check for mounted root file system.

Differential Revision: https://reviews.freebsd.org/D22571
PR: 241639
MFC after: 1 week
Sponsored by: Mellanox Technologies


# aa4612d1 25-Nov-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix panic when loading kernel modules before root file system is mounted.
Make sure the rootvnode is always NULL checked.

Differential Revision: https://reviews.freebsd.org/D22545
PR: 241639
MFC after: 1 week
Sponsored by: Mellanox Technologies


# d158fa4a 20-Oct-2018 Conrad Meyer <cem@FreeBSD.org>

Add flags variants to linker_files / stack(9) symbol resolution

Some best-effort consumers may find trylock behavior for stack(9) symbol
resolution acceptable. Expose that behavior to such consumers.

This API is ugly. If in the future the modules and linker file list locking
is cleaned up such that the linker_files list can be iterated safely without
acquiring a sleepable lock, this API should be removed. However, most of
the time nothing will be holding the linker files lock exclusive and the
acquisition can proceed.

Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17620


# 0455a92b 17-Oct-2018 Bjoern A. Zeeb <bz@FreeBSD.org>

The countp argument passed to linker_file_lookup_set() in
linker_load_dependencies() is unused, so no need to ask for the
value in first place. Remove the unused "count" variable.

Approved by: re (kib)


# 891cf3ed 18-May-2018 Ed Maste <emaste@FreeBSD.org>

Use NULL for SYSINIT's last arg, which is a pointer type

Sponsored by: The FreeBSD Foundation


# 8a36da99 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/kern: adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# edb01d11 15-Nov-2017 Gordon Tetlow <gordon@FreeBSD.org>

Properly bzero kldstat structure to prevent kernel information leak.

Submitted by: kib
Reported by: TJ Corley
Security: CVE-2017-1088


# 693593b6 04-Oct-2017 Andriy Gapon <avg@FreeBSD.org>

sysctl-s in a module should be accessible only when the module is initialized

A sysctl can have a custom handler that may access data that is initialized
via SYSINIT(9) or via a module event handler (also invoked via SYSINIT).
Thus, it is not safe to allow access to the module's sysctl-s until
the initialization is performed. Likewise, we should not allow access
to teh sysctl-s after the module is uninitialized.
The latter is easy to achieve by properly ordering linker_file_unregister_sysctls
and linker_file_sysuninit.
The former is not as easy for two reasons:
- the initialization may depend on tunables which get set when sysctl-s are
registered, so we need to set the tunables before running sysinit-s
- the initialization may try to dynamically add more sysctl-s under statically
defined sysctl nodes
So, this change splits the sysctl setup into two phases. In the first phase
the sysctl-s are registered as before but they are disabled and hidden from
consumers. In the second phase, done after sysinit-s, normal access to the
sysctl-s is enabled.

The change should affect only dynamic module loading and unloading after
the system boot-up. Nothing changes for sysctl-s compiled into the kernel
and sysctl-s in preloaded modules.

Discussed with: hselasky, ian, jhb
Reviewed by: julian, kib
MFC after: 2 weeks
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D12545


# 550374ef 01-Oct-2017 Andriy Gapon <avg@FreeBSD.org>

revert r324166, it has an unrelated change in it


# ca3fec50 29-Jul-2017 Conrad Meyer <cem@FreeBSD.org>

kldstat: Use sizeof in place of named constants for sizing

No functional change.

This is handy for FreeBSD derivatives that want to modify the value of
MAXPATHLEN, but not the kld_file_stat ABI.

Submitted by: Siddhant Agarwal <sagarwal AT isilon.com>
Sponsored by: Dell EMC Isilon


# d0147e10 08-Mar-2017 Gleb Smirnoff <glebius@FreeBSD.org>

In linker_load_file() print name of a file that failed to load.

Discussed with: kib


# d9ce8a41 12-Oct-2016 Conrad Meyer <cem@FreeBSD.org>

kern_linker: Handle module-loading failures in preloaded .ko files

The runtime kernel loader, linker_load_file, unloads kernel files that
failed to load all of their modules. For consistency, treat preloaded
(loader.conf loaded) kernel files in the same way.

Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8200


# 8a3aeac2 09-Jun-2016 Conrad Meyer <cem@FreeBSD.org>

Add DDB command "kldstat"

It prints much the same information as kldstat(8) without any arguments.

Suggested by: jhibbits
Sponsored by: EMC / Isilon Storage Division


# e3043798 29-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/kern: spelling fixes in comments.

No functional change.


# d9c9c81c 21-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: use our roundup2/rounddown2() macros when param.h is available.

rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.


# 35030a5d 29-Mar-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove some NULL checks for M_WAITOK allocations.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# da82615a 10-Dec-2015 Warner Losh <imp@FreeBSD.org>

Create the MDT_PNP_INFO metadata record to communicate PNP info about
modules. External agents may use this data to automatically load those
modules.

Differential Review: https://reviews.freebsd.org/D3461


# 645743ea 12-Nov-2015 John Baldwin <jhb@FreeBSD.org>

Export various helper variables describing the layout and size of
certain kernel structures for use by debuggers. This mostly aids
in examining cores from a kernel without debug symbols as a debugger
can infer these values if debug symbols are available.

One set of variables describes the layout of 'struct linker_file' to
walk the list of loaded kernel modules.

A second set of variables describes the layout of 'struct proc' and
'struct thread' to walk the list of processes in the kernel and the
threads in each process.

The 'pcb_size' variable is used to index into the stoppcbs[] array.

The 'vm_maxuser_address' is used to distinguish kernel virtual addresses
from user addresses. This doesn't have to be perfect, and
'vm_maxuser_address' is a cheap and simple way to differentiate kernel
pointers from simple values like TIDs and PIDs.

While here, annotate the fields in struct pcb used by kgdb on amd64
and i386 to note that their ABI should be preserved. Annotations for
other platforms will be added in the future.

Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D3773


# 7665e341 15-Sep-2015 Mateusz Guzik <mjg@FreeBSD.org>

sysctl: switch sysctllock to a sleepable rmlock, take 2

This restores r285125. Previous attempt was reverted due to a bug in rmlocks,
which is fixed since r287833.


# 4ae1e3c7 30-Jul-2015 Mateusz Guzik <mjg@FreeBSD.org>

Revert r285125 until rmlocks get fixed.

Right now there is a chance that sysctl unregister will cause reader to
block on the sx lock associated with sysctl rmlock, in which case kernels
with debug enabled will panic.


# b8633775 04-Jul-2015 Mateusz Guzik <mjg@FreeBSD.org>

sysctl: switch sysctllock to a sleepable rmlock

The lock is almost never taken for writing.


# 15c2b301 08-Jun-2015 John Baldwin <jhb@FreeBSD.org>

Revert r284153, as I believe it breaks the dtrace sdt module. I will
fix the original issue a different way.


# 69c5c774 08-Jun-2015 John Baldwin <jhb@FreeBSD.org>

Add an internal "locked" variant of linker_file_lookup_set() and change
the public function to acquire the global linker lock directly. This
permits linker_file_lookup_set() to be safely used from other modules.


# d0b6da08 03-Dec-2014 Warner Losh <imp@FreeBSD.org>

Const poison in a few places to ensure we don't modify things
through the module data pointer.


# fac92ae1 29-Nov-2014 Warner Losh <imp@FreeBSD.org>

The current limit of 100k for the linker hints file is getting a bit
crowded as we now are at about 70k. Bump the limit to 1MB instead
which is still quite a reasonable limit and allows for future growth
of this file and possible future expansion to additional data.

MFC After: 2 weeks


# 2afec8ed 21-Oct-2014 Mateusz Guzik <mjg@FreeBSD.org>

Take the lock shared in linker_search_symbol_name.

This helps sysctl kern.proc.stack.


# 580a0117 21-Oct-2014 Mateusz Guzik <mjg@FreeBSD.org>

Rename sysctl_lock and _unlock to sysctl_xlock and _xunlock.


# 0067051f 20-Oct-2014 Marcel Moolenaar <marcel@FreeBSD.org>

Fully support constructors for the purpose of code coverage analysis.
This involves:
1. Have the loader pass the start and size of the .ctors section to the
kernel in 2 new metadata elements.
2. Have the linker backends look for and record the start and size of
the .ctors section in dynamically loaded modules.
3. Have the linker backends call the constructors as part of the final
work of initializing preloaded or dynamically loaded modules.

Note that LLVM appends the priority of the constructors to the name of
the .ctors section. Not so when compiling with GCC. The code currently
works for GCC and not for LLVM.

Submitted by: Dmitry Mikulin <dmitrym@juniper.net>
Obtained from: Juniper Networks, Inc.


# af3b2549 27-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Pull in r267961 and r267973 again. Fix for issues reported will follow.


# 37a107a4 27-Jun-2014 Glen Barber <gjb@FreeBSD.org>

Revert r267961, r267973:

These changes prevent sysctl(8) from returning proper output,
such as:

1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory


# 3da1cf1e 27-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after: 2 weeks
Sponsored by: Mellanox Technologies


# 14fcb4b4 05-Apr-2014 Konstantin Belousov <kib@FreeBSD.org>

Use realloc(9) instead of doing the reallocation inline.

Submitted by: bde
MFC after: 1 week


# cee9542d 12-Mar-2014 Konstantin Belousov <kib@FreeBSD.org>

Use correct types for sizeof() in the calculations for the malloc(9) sizes [1].
While there, remove unneeded checks for failed allocations with M_WAITOK flag.

Submitted by: Conrad Meyer <cemeyer@uw.edu> [1]
MFC after: 1 week


# 8f725462 18-Dec-2013 Mark Johnston <markj@FreeBSD.org>

Invoke the kld_* event handlers from linker_load_file() and
linker_unload_file() rather than kern_kldload() and kern_kldunload(). This
ensures that the handlers are invoked for files that are loaded/unloaded
automatically as dependencies. Previously, they were only invoked for files
loaded by a user.

As a side effect, the kld_load and kld_unload handlers are now invoked with
the kernel linker lock exclusively held.

Reported by: avg
Reviewed by: jhb
MFC after: 2 weeks


# 29f4e216 24-Aug-2013 Mark Johnston <markj@FreeBSD.org>

Rename the kld_unload event handler to kld_unload_try, and add a new
kld_unload event handler which gets invoked after a linker file has been
successfully unloaded. The kld_unload and kld_load event handlers are now
invoked with the shared linker lock held, while kld_unload_try is invoked
with the lock exclusively held.

Convert hwpmc(4) to use these event handlers instead of having
kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are
loaded or unloaded. This has no functional effect, but simplifes the linker
code somewhat.

Reviewed by: jhb


# 0770f164 24-Aug-2013 Mark Johnston <markj@FreeBSD.org>

Set things up so that linker_file_lookup_set() is always called with the
linker lock held. This makes it possible to call it from a kld event handler
with the shared lock held.

Reviewed by: jhb


# 3a277424 24-Aug-2013 Mark Johnston <markj@FreeBSD.org>

Remove the kld lock macros and just use the sx(9) API. Add locking in
linker_init_kernel_modules() and linker_preload() in order to remove most
of the checks for !cold before asserting that the kld lock is held. These
routines are invoked by SYSINIT(9), so there's no harm in them taking the
kld lock.


# 196f2f42 15-Aug-2013 Mark Johnston <markj@FreeBSD.org>

Use strdup(9) instead of reimplementing it.


# 12ede07a 13-Aug-2013 Mark Johnston <markj@FreeBSD.org>

Use kld_{load,unload} instead of mod_{load,unload} for the linker file load
and unload event handlers added in r254266.

Reported by: jhb
X-MFC with: r254266


# 8776669b 12-Aug-2013 Mark Johnston <markj@FreeBSD.org>

FreeBSD's DTrace implementation has a few problems with respect to handling
probes declared in a kernel module when that module is unloaded. In
particular,

* Unloading a module with active SDT probes will cause a panic. [1]
* A module's (FBT/SDT) probes aren't destroyed when the module is unloaded;
trying to use them after the fact will generally cause a panic.

This change fixes both problems by porting the DTrace module load/unload
handlers from illumos and registering them with the corresponding
EVENTHANDLER(9) handlers. This allows the DTrace framework to destroy all
probes defined in a module when that module is unloaded, and to prevent a
module unload from proceeding if some of its probes are active. The latter
problem has already been fixed for FBT probes by checking lf->nenabled in
kern_kldunload(), but moving the check into the DTrace framework generalizes
it to all kernel providers and also fixes a race in the current
implementation (since a probe may be activated between the check and the
call to linker_file_unload()).

Additionally, the SDT implementation has been reworked to define SDT
providers/probes/argtypes in linker sets rather than using SYSINIT/SYSUNINIT
to create and destroy SDT probes when a module is loaded or unloaded. This
simplifies things quite a bit since it means that pretty much all of the SDT
code can live in sdt.ko, and since it becomes easier to integrate SDT with
the DTrace framework. Furthermore, this allows FreeBSD to be quite flexible
in that SDT providers spanning multiple modules can be created on the fly
when a module is loaded; at the moment it looks like illumos' SDT
implementation requires all SDT probes to be statically defined in a single
kernel table.

PR: 166927, 166926, 166928
Reported by: davide [1]
Reviewed by: avg, trociny (earlier version)
MFC after: 1 month


# 9c6139e4 12-Aug-2013 Mark Johnston <markj@FreeBSD.org>

Remove some unused fields from struct linker_file. They were added in
r172862 for use by the DTrace SDT framework but don't seem to have ever
been used.

MFC after: 2 weeks


# c9b645b5 12-Aug-2013 Mark Johnston <markj@FreeBSD.org>

Add event handlers for module load and unload events. The load handlers are
called after the module has been loaded, and the unload handlers are called
before the module is unloaded. Moreover, the module unload handlers may
return an error to prevent the unload from proceeding.

Reviewed by: avg
MFC after: 2 weeks


# 5050aa86 22-Oct-2012 Konstantin Belousov <kib@FreeBSD.org>

Remove the support for using non-mpsafe filesystem modules.

In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by: attilio
Tested by: pho


# 7582954e 12-Apr-2012 John Baldwin <jhb@FreeBSD.org>

If a linker file contains at least one module, but all of the modules
fail to load (the MOD_LOAD event fails) during a kldload(2), unload the
linker file and fail the kldload(2) with ENOEXEC.

Reported by: gcooper
MFC after: 1 week


# 5b0da85a 22-Mar-2012 Andrey V. Elsukov <ae@FreeBSD.org>

Correct debug message.


# c5e7f064 21-Mar-2012 Andrey V. Elsukov <ae@FreeBSD.org>

Acquire modules lock before call module_getname() in the KLD_DEBUG case.

MFC after: 1 week


# b26a0984 15-Mar-2012 Andrey V. Elsukov <ae@FreeBSD.org>

Add CTLFLAG_TUN to the sysctl definition and fix style.

Pointed by: Garrett Cooper
MFC after: 2 weeks


# 199aa975 14-Mar-2012 Andrey V. Elsukov <ae@FreeBSD.org>

Add debug.kld_debug loader tunable.

MFC after: 2 weeks


# 526d0bd5 20-Feb-2012 Konstantin Belousov <kib@FreeBSD.org>

Fix found places where uio_resid is truncated to int.

Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.

Discussed with: bde, das (previous versions)
MFC after: 1 month


# fcdd3d32 20-Feb-2012 Xin LI <delphij@FreeBSD.org>

Revert r231923 for now. Further work is needed to make sure that the
behavior is consistent.


# 5bfbb598 19-Feb-2012 Xin LI <delphij@FreeBSD.org>

Use uprintf instead of printf for the reason why a kernel module can not
be loaded. This way, the administrator can get response immediately from
the shell session rather than relying on dmesg.

MFC after: 1 month


# dc15eac0 01-Jan-2012 Ed Schouten <ed@FreeBSD.org>

Use strchr() and strrchr().

It seems strchr() and strrchr() are used more often than index() and
rindex(). Therefore, simply migrate all kernel code to use it.

For the XFS code, remove an empty line to make the code identical to
the code in the Linux kernel.


# 4e313b69 06-Nov-2011 Max Khon <fjoe@FreeBSD.org>

Add KLD_DEBUG option.


# 8451d0dd 16-Sep-2011 Kip Macy <kmacy@FreeBSD.org>

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


# 062393fe 31-Jul-2011 Gleb Smirnoff <glebius@FreeBSD.org>

Don't leak kld_sx lock in kldunloadf().

Approved by: re (kib)


# 05a4755e 17-Jul-2011 Ryan Stone <rstone@FreeBSD.org>

Fix a LOR between hwpmc and the kernel linker. When a system-wide
sampling mode PMC is allocated, hwpmc calls linker_hwpmc_list_objects()
while already holding an exclusive lock on pmc-sx lock. list_objects()
tries to acquire an exclusive lock on the kld_sx lock. When a KLD module
is loaded or unloaded successfully, kern_kld(un)load calls into the pmc
hook while already holding an exclusive lock on the kld_sx lock. Calling
the pmc hook requires acquiring a shared lock on the pmc-sx lock.

Fix this by only acquiring a shared lock on the kld_sx lock in
linker_hwpmc_list_objects(), and also downgrading to a shared lock on the
kld_sx lock in kern_kld(un)load before calling into the pmc hook. In
kern_kldload this required moving some modifications of the linker_file_t
to happen before calling into the pmc hook.

This fixes the deadlock by ensuring that the hwpmc -> list_objects() case
is always able to proceed. Without this patch, I was able to deadlock a
multicore system within minutes by constantly loading and unloading an KLD
module while I simultaneously started a sampling mode PMC in a loop.

MFC after: 1 month


# 86665509 30-Mar-2011 Konstantin Belousov <kib@FreeBSD.org>

Provide compat32 shims for kldstat(2).

Requested and tested by: jpaetzel
MFC after: 1 week


# 2fee06f0 18-Jan-2011 Matthew D Fleming <mdf@FreeBSD.org>

Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need
to rely on the format string.


# 7b956487 05-Jan-2011 Edward Tomasz Napierala <trasz@FreeBSD.org>

Fix page fault that occurred when trying to initialize preloaded kernel module,
the dependency of which was preloaded, but failed to initialize. Previously,
kernel dereferenced NULL pointer returned by modlist_lookup2(); now, when this
happens, we unload the dependent module. Since the depended_files list is
sorted in dependency order, this properly propagates, unloading modules that
depend on failed ones.

From the user point of view, this prevents the kernel from panicing when
trying to boot kernel compiled without KDTRACE_HOOKS with dtraceall_load="YES"
in /boot/loader.conf.

Reviewed by: kib


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 61548876 22-Sep-2010 Andriy Gapon <avg@FreeBSD.org>

kdb_backtrace: use stack_print_ddb instead of stack_print

This is a followup to r212964.
stack_print call chain obtains linker sx lock and thus potentially may
lead to a deadlock depending on a kind of a panic.
stack_print_ddb doesn't acquire any locks and it doesn't use any
facilities of ddb backend.
Using stack_print_ddb outside of DDB ifdef required taking a number of
helper functions from under it as well.

It is a good idea to rename linker_ddb_* and stack_*_ddb functions to
have 'unlocked' component in their name instead of 'ddb', because those
functions do not use any DDB services, but instead they provide unlocked
access to linker symbol information. The latter was previously needed
only for DDB, hence the 'ddb' name component.

Alternative is to ditch unlocked versions altogether after implementing
proper panic handling:
1. stop other cpus upon a panic
2. make all non-spinlock lock operations (mutex, sx, rwlock) be a no-op
when panicstr != NULL

Suggested by: mdf
Discussed with: attilio
MFC after: 2 weeks


# 3ea6157e 17-Nov-2009 Oleksandr Tymoshenko <gonzo@FreeBSD.org>

- Unbreak build with KLD_DEBUG defined
- Add debug.kld_debug sysctl to control KLD debugging level
- Print information about KLD dependencies with debug enabled


# 67784314 08-Sep-2009 Poul-Henning Kamp <phk@FreeBSD.org>

Revert previous commit and add myself to the list of people who should
know better than to commit with a cat in the area.


# b34421bf 08-Sep-2009 Poul-Henning Kamp <phk@FreeBSD.org>

Add necessary include.


# 530c0060 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# 5b4d5f9e 21-Jul-2009 Rui Paulo <rpaulo@FreeBSD.org>

Improve the printf message when a module failed to load. This gives the
user some clue about the possibility of a __FreeBSD_version mismatch.

Discussed with: rwatson, jhb
Approved by: re (kib)


# 7afcbc18 17-Jul-2009 Jamie Gritton <jamie@FreeBSD.org>

Remove the interim vimage containers, struct vimage and struct procg,
and the ioctl-based interface that supported them.

Approved by: re (kib), bz (mentor)


# eddfbb76 14-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# 5257ff7e 29-Jun-2009 Attilio Rao <attilio@FreeBSD.org>

Don't assume a default (currently 15) value for preloaded klds when
loading hwpmc, but calculate at runtime and allocate the necessary space.
Also the current logic is wrong as it can lead to an endless loop.

Sponsored by: Sandvine Incorporated
Reported by: Ryan Stone <rstone at sandvine dot com>
Tested by: Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
Approved by: re (kib)


# bcf11e8d 05-Jun-2009 Robert Watson <rwatson@FreeBSD.org>

Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with: pjd


# 0304c731 27-May-2009 Jamie Gritton <jamie@FreeBSD.org>

Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails. Child jails may be restricted more than their parents,
but never less. Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system. Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings. The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by: bz (mentor)


# 2114e063 08-May-2009 Marko Zec <zec@FreeBSD.org>

A NOP change: style / whitespace cleanup of the noise that slipped
into r191816.

Spotted by: bz
Approved by: julian (mentor) (an earlier version of the diff)


# 29b02909 08-May-2009 Marko Zec <zec@FreeBSD.org>

Introduce a new virtualization container, provisionally named vprocg, to hold
virtualized instances of hostname and domainname, as well as a new top-level
virtualization struct vimage, which holds pointers to struct vnet and struct
vprocg. Struct vprocg is likely to become replaced in the near future with
a new jail management API import.

As a consequence of this change, change struct ucred to point to a struct
vimage, instead of directly pointing to a vnet.

Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage
branch.

Permit kldload / kldunload operations to be executed only from the default
vimage context.

This change should have no functional impact on nooptions VIMAGE kernel
builds.

Reviewed by: bz
Approved by: julian (mentor)


# 21ca7b57 05-May-2009 Marko Zec <zec@FreeBSD.org>

Change the curvnet variable from a global const struct vnet *,
previously always pointing to the default vnet context, to a
dynamically changing thread-local one. The currvnet context
should be set on entry to networking code via CURVNET_SET() macros,
and reverted to previous state via CURVNET_RESTORE(). Recursions
on curvnet are permitted, though strongly discuouraged.

This change should have no functional impact on nooptions VIMAGE
kernel builds, where CURVNET_* macros expand to whitespace.

The curthread->td_vnet (aka curvnet) variable's purpose is to be an
indicator of the vnet context in which the current network-related
operation takes place, in case we cannot deduce the current vnet
context from any other source, such as by looking at mbuf's
m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so
far curvnet has turned out to be an invaluable consistency checking
aid: it helps to catch cases when sockets, ifnets or any other
vnet-aware structures may have leaked from one vnet to another.

The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros
was a result of an empirical iterative process, whith an aim to
reduce recursions on CURVNET_SET() to a minimum, while still reducing
the scope of CURVNET_SET() to networking only operations - the
alternative would be calling CURVNET_SET() on each system call entry.
In general, curvnet has to be set in three typicall cases: when
processing socket-related requests from userspace or from within the
kernel; when processing inbound traffic flowing from device drivers
to upper layers of the networking stack, and when executing
timer-driven networking functions.

This change also introduces a DDB subcommand to show the list of all
vnet instances.

Approved by: julian (mentor)


# a1d7ce03 10-Feb-2009 Attilio Rao <attilio@FreeBSD.org>

Scanning all the formats for binary translation of modules loading can
result in errors for a format loading but subsequent correct recognizing
for another format.

File format loading functions should avoid printing any additional
informations but just returning appropriate (and different between each
other) error condition, characterizing different informations.
Additively, the linker should handle appropriately different format
loading errors.

While a general mechanism is desired, fix a simple and common case on
amd64: file type is not recognized for link elf and confuses the linker.
Printout an error if all the registered linker classes can't recognize
and load the module.

Reviewed by: jhb
Sponsored by: Sandvine Incorporated


# 875b66a0 06-Feb-2009 John Baldwin <jhb@FreeBSD.org>

Expand the scope of the sysctllock sx lock to protect the sysctl tree itself.
Back in 1.1 of kern_sysctl.c the sysctl() routine wired the "old" userland
buffer for most sysctls (everything except kern.vnode.*). I think to prevent
issues with wiring too much memory it used a 'memlock' to serialize all
sysctl(2) invocations, meaning that only one user buffer could be wired at
a time. In 5.0 the 'memlock' was converted to an sx lock and renamed to
'sysctl lock'. However, it still only served the purpose of serializing
sysctls to avoid wiring too much memory and didn't actually protect the
sysctl tree as its name suggested. These changes expand the lock to actually
protect the tree.

Later on in 5.0, sysctl was changed to not wire buffers for requests by
default (sysctl_handle_opaque() will still wire buffers larger than a single
page, however). As a result, user buffers are no longer wired as often.
However, many sysctl handlers still wire user buffers, so it is still
desirable to serialize userland sysctl requests. Kernel sysctl requests
are allowed to run in parallel, however.

- Expose sysctl_lock()/sysctl_unlock() routines to exclusively lock the
sysctl tree for a few places outside of kern_sysctl.c that manipulate
the sysctl tree directly including the kernel linker and vfs_register().
- sysctl_register() and sysctl_unregister() require the caller to lock
the sysctl lock using sysctl_lock() and sysctl_unlock(). The rest of
the public sysctl API manage the locking internally.
- Add a locked variant of sysctl_remove_oid() for internal use so that
external uses of the API do not need to be aware of locking requirements.
- The kernel linker no longer needs Giant when manipulating the sysctl
tree.
- Add a missing break to the loop in vfs_register() so that we stop looking
at the sysctl MIB once we have changed it.

MFC after: 1 month


# e4d9b9eb1 05-Feb-2009 John Baldwin <jhb@FreeBSD.org>

Drop the kernel linker lock while running SYSUNINIT routines and removing
sysctls during a linker file unload. We drop the lock when doing similar
operations during a linker file load. To close races, clear the LINKED
flag before dropping the lock so that the linker file is no longer visible
to userland.

MFC after: 1 week


# 385195c0 10-Dec-2008 Marko Zec <zec@FreeBSD.org>

Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# b4824b48 05-Dec-2008 John Baldwin <jhb@FreeBSD.org>

- Invoke MOD_QUIESCE on all modules in a linker file (kld) before
unloading any modules. As a result, if any module veto's an unload
request via MOD_QUIESCE, the entire set of modules for that linker
file will remain loaded and active now rather than leaving the kld
in a weird state where some modules are loaded and some are unloaded.
- This also moves the logic for handling the "forced" unload flag out of
kern_module.c and into kern_linker.c which is a bit cleaner.
- Add a module_name() routine that returns the name of a module and use that
instead of printing pointer values in debug messages when a module fails
MOD_QUIESCE or MOD_UNLOAD.

MFC after: 1 month


# e11e3f18 23-Oct-2008 Dag-Erling Smørgrav <des@FreeBSD.org>

Fix a number of style issues in the MALLOC / FREE commit. I've tried to
be careful not to fix anything that was already broken; the NFSv4 code is
particularly bad in this respect.


# 1ede983c 23-Oct-2008 Dag-Erling Smørgrav <des@FreeBSD.org>

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 0359a12e 28-Aug-2008 Attilio Rao <attilio@FreeBSD.org>

Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


# 4b3d6093 23-May-2008 John Birrell <jb@FreeBSD.org>

Add the ctf_get function and update the args to linker_file_function_listall.


# d90d4eb2 21-Apr-2008 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Back-out previous revision. For now I can use _ddb() variants of stack(9) KPI,
as I use it for debugging only. Once someone will need it for more production
features, the change should be reconsider.

Requested by: rwatson


# f55f27f8 17-Apr-2008 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Allow linker_search_symbol_name() to be called with KLD lock held.
The linker_search_symbol_name() function is used by stack_print()
and stack_print() can be called from kernel module unload method.

MFC after: 1 week


# 237fdd78 16-Mar-2008 Robert Watson <rwatson@FreeBSD.org>

In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation. This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after: 1 month
Discussed with: imp, rink


# 22db15c0 13-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>


# cdd475b3 01-Dec-2007 Robert Watson <rwatson@FreeBSD.org>

The kernel linker includes a number of utility functions to look up symbol
information in support of DDB(4); these functions bypass normal linker
locking as they may run in contexts where locking is unsafe (such as the
kernel debugger).

Add a new interface linker_ddb_search_symbol_name(), which looks up a
symbol name and offset given an address, and also
linker_search_symbol_name() which does the same but *does* follow the
locking conventions of the linker.

Unlike existing functions, these functions place the name in a
caller-provided buffer, which is stable even after linker locks have been
released. These functions will be used in upcoming revisions to stack(9)
to support kernel stack trace generation in contexts as part of a live,
rather than suspended, kernel.


# f6c15301 17-Nov-2007 John Birrell <jb@FreeBSD.org>

Add a function to list symbols in a file and their values at the
same time rather than having to list the symbols and then go back
and look each one up by name.


# 30d239bc 24-Oct-2007 Robert Watson <rwatson@FreeBSD.org>

Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

mac_<object>_<method/action>
mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer


# 1676805c 21-Oct-2007 John Birrell <jb@FreeBSD.org>

Add the full module path name to the kld_file_stat structure
for kldstat(2).

This allows libdtrace to determine the exact file from which
a kernel module was loaded without having to guess.

The kldstat(2) API is versioned with the size of the
kld_file_stat structure, so this change creates version 2.

Add the pathname to the verbose output of kldstat(8) too.

MFC: 3 days


# 9e223287 31-May-2007 Konstantin Belousov <kib@FreeBSD.org>

Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by: jhb
Reviewed by: daichi (unionfs)
Approved by: re (kensmith)


# c14d15ae 22-Apr-2007 Robert Watson <rwatson@FreeBSD.org>

Remove MAC Framework access control check entry points made redundant with
the introduction of priv(9) and MAC Framework entry points for privilege
checking/granting. These entry points exactly aligned with privileges and
provided no additional security context:

- mac_check_sysarch_ioperm()
- mac_check_kld_unload()
- mac_check_settime()
- mac_check_system_nfsd()

Add mpo_priv_check() implementations to Biba and LOMAC policies, which,
for each privilege, determine if they can be granted to processes
considered unprivileged by those two policies. These mostly, but not
entirely, align with the set of privileges granted in jails.

Obtained from: TrustedBSD Project


# 0c14ff0e 04-Mar-2007 Robert Watson <rwatson@FreeBSD.org>

Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.


# 4a0f58d2 26-Feb-2007 John Baldwin <jhb@FreeBSD.org>

Fix a comment.


# 498eccc9 23-Feb-2007 John Baldwin <jhb@FreeBSD.org>

Drop the global kernel linker lock while executing the sysinit's for a
freshly-loaded kernel module. To avoid various unload races, hide linker
files whose sysinit's are being run from userland so that they can't be
kldunloaded until after all the sysinit's have finished.

Tested by: gallatin


# acd3428b 06-Nov-2006 Robert Watson <rwatson@FreeBSD.org>

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# aed55708 22-Oct-2006 Robert Watson <rwatson@FreeBSD.org>

Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from: TrustedBSD Project
Sponsored by: SPARTA


# 0f8e0c3d 10-Jul-2006 John Baldwin <jhb@FreeBSD.org>

Explicitly use STAILQ_REMOVE_HEAD() when we know we are removing the head
element to avoid confusing Coverity. It's now also easier for humans to
parse as well.

Found by: Coverity Prevent(tm)
CID: 1201


# 0bf8969c 10-Jul-2006 John Baldwin <jhb@FreeBSD.org>

Fix two more instances of using a linker_file_t object in TAILQ() macros
after free'ing it.

Found by: Coverity Prevent(tm)
CID: 1435


# 6b5b470a 10-Jul-2006 John Baldwin <jhb@FreeBSD.org>

Don't try to reuse the linker_file structure after we've freed it when
throwing out the kld's loaded by the loader that didn't successfully link.

Found by: Coverity Prevent(tm)
CID: 1435


# 398c993b 06-Jul-2006 John Baldwin <jhb@FreeBSD.org>

- Explicitly acquire Giant around SYSINIT's and SYSUNINIT's since they are
not all known to be MPSAFE yet.
- Actually remove Giant from the kernel linker by taking it out of the
KLD_LOCK() and KLD_UNLOCK() macros.

Pointy hat to: jhb (2)


# 70f37788 21-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Replace the kld_mtx mutex with a kld_sx sx lock and expand it's scope to
protect all linker-related data structures including the contents of
linker file objects and the any linker class data as well. Considering how
rarely the linker is used I just went with the simple solution of
single-threading the whole thing rather than expending a lot of effor on
something more fine-grained and complex. Giant is still explicitly
acquired while registering and deregistering sysctl's as well as in the
elf linker class while calling kmupetext(). The rest of the linker runs
without Giant unless it has to acquire Giant while loading files from a
non-MPSAFE filesystem.


# cbda6f95 21-Jun-2006 John Baldwin <jhb@FreeBSD.org>

- Push down Giant in kldfind() and kldsym().
- Remove several goto's by either using direct return's or else clauses.


# 9dd44bd7 21-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Fix two comments and a style fix.


# 0df29727 21-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Various whitespace fixes.


# 62d615d5 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Conditionally acquire Giant around VFS operations.


# aeeb017b 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

- Push Giant down into linker_reference_module().
- Add a new function linker_release_module() as a more intuitive complement
to linker_reference_module() that wraps linker_file_unload().
linker_release_module() can either take the module name and version info
passed to linker_reference_module() or it can accept the linker file
object returned by linker_reference_module().


# f462ce3e 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Make linker_find_file_by_name() and linker_find_file_by_id() static to
simplify linker locking. The only external consumers now use
linker_file_foreach().


# 93215106 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

- Add a new linker_file_foreach() function that walks the list of linker
file objects calling a user-specified predicate function on each object.
The iteration terminates either when the entire list has been iterated
over or the predicate function returns a non-zero value.
linker_file_foreach() returns the value returned by the last invocation
of the predicate function. It also accepts a void * context pointer that
is passed to the predicate function as well. Using an iterator function
avoids exposing linker internals to the rest of the kernel making locking
simpler.
- Use linker_file_foreach() instead of walking the list of linker files
manually to lookup ndis files in ndis(4).
- Use linker_file_foreach() to implement linker_hwpmc_list_objects().


# aaf31705 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Make linker_file_add_dependency() and linker_load_module() static since
only the linker uses them.


# e767366f 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Don't check if malloc(M_WAITOK) returns NULL.


# e5bb3a01 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Use 'else' to remove another goto.


# 73a2437a 20-Jun-2006 John Baldwin <jhb@FreeBSD.org>

- Remove some useless variable initializations.
- Make some conditional free()'s where the condition was always true
unconditional.


# e1684acf 13-Jun-2006 Marcel Moolenaar <marcel@FreeBSD.org>

Unbreak 64-bit architectures. The 3rd argument to kern_kldload() is
a pointer to an integer and td->td_retval[0] is of type register_t.
On 64-bit architectures register_t is wider than an integer.


# d5388587 13-Jun-2006 John Baldwin <jhb@FreeBSD.org>

- Add a kern_kldload() that is most of the previous kldload() and push
Giant down in it.
- Push Giant down in kern_kldunload() and reorganize it slightly to avoid
using gotos. Also, expose this function to the rest of the kernel.


# 6b3d277a 13-Jun-2006 John Baldwin <jhb@FreeBSD.org>

- Push down Giant some in kldstat().
- Use a 'struct kld_file_stat' on the stack to read data under the lock
and then do one copyout() w/o holding the lock at the end to push the
data out to userland.


# b904477c 13-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Unexpand TAILQ_FOREACH() and TAILQ_FOREACH_SAFE().


# 3a600aea 13-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Remove some more pointless goto's and don't check to see if
malloc(M_WAITOK) returns NULL.


# 2fa6cc80 13-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Handle the simple case of just dropping a reference near the start of
linker_file_unload() instead of in the middle of a bunch of code for
the case of dropping the last reference to improve readability and sanity.
While I'm here, remove pointless goto's that were just jumping to a
return statement.


# e38c7f3e 27-May-2006 Xin LI <delphij@FreeBSD.org>

extlen and cpp is not used here in linker_search_kld(), so nuke them.

Reported by: Mingyan Guo <guomingyan at gmail dot com>
MFC After: 2 weeks


# 49874f6e 25-Mar-2006 Joseph Koshy <jkoshy@FreeBSD.org>

MFP4: Support for profiling dynamically loaded objects.

Kernel changes:

Inform hwpmc of executable objects brought into the system by
kldload() and mmap(), and of their removal by kldunload() and
munmap(). A helper function linker_hwpmc_list_objects() has been
added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve
the list of currently loaded kernel modules.

The unused `MAPPINGCHANGE' event has been deprecated in favour
of separate `MAP_IN' and `MAP_OUT' events; this change reduces
space wastage in the log.

Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to
handle the map change callbacks.

Change the default per-cpu sample buffer size to hold
32 samples (up from 16).

Increment __FreeBSD_version.

libpmc(3) changes:

Update libpmc(3) to deal with the new events in the log file; bring
the pmclog(3) manual page in sync with the code.

pmcstat(8) changes:

Introduce new options to pmcstat(8): "-r" (root fs path), "-M"
(mapfile name), "-q"/"-v" (verbosity control). Option "-k" now
takes a kernel directory as its argument but will also work with
the older invocation syntax.

Rework string handling in pmcstat(8) to use an opaque type for
interned strings. Clean up ELF parsing code and add support for
tracking dynamic object mappings reported by a v2.0.00 hwpmc(4).

Report statistics at the end of a log conversion run depending
on the requested verbosity level.

Reviewed by: jhb, dds (kernel parts of an earlier patch)
Tested by: gallatin (earlier patch)


# 135c43dc 19-Oct-2005 John Polstra <jdp@FreeBSD.org>

Fix a bug in the kernel module runtime linker that made it impossible
to unload the usb.ko module after boot if it was originally preloaded
from "/boot/loader.conf". When processing preloaded modules, the
linker erroneously added self-dependencies the each module's reference
count. That prevented usb.ko's reference count from ever going to 0,
so it could not be unloaded.

Sponsored by Isilon Systems.

Reviewed by: pjd, peter
MFC after: 1 week


# 885fec3e 28-May-2005 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Fix panic when module is compiled in and it is loaded from loader.conf.
Only panic is fixed, module will be still listed in kldstat(8) output.
Not sure what is correct fix, because adding unloading code in case of
failure to linker_init_kernel_modules() doesn't work.


# 870fba26 28-May-2005 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Prevent loading modules with are compiled into the kernel.

PR: kern/48759
Submitted by: Pawe³ Ma³achowski <pawmal@unia.3lo.lublin.pl>
Patch from: demon
MFC after: 2 weeks


# ea2b9b3e 31-Mar-2005 John Baldwin <jhb@FreeBSD.org>

- Denote a few places where kobj class references are manipulated without
holding the appropriate lock.
- Add a comment explaining why we bump a driver's kobj class reference
when loading a module.


# 0ca311f6 26-Aug-2004 Ian Dowse <iedowse@FreeBSD.org>

When trying each linker class in turn with a preloaded module, exit
the loop if the preload was successful. Previously a successful
preload was ignored if the linker class was not the last in the
list.


# 65a311fc 13-Jul-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Give kldunload a -f(orce) argument.

Add a MOD_QUIESCE event for modules. This should return error (EBUSY)
of the module is in use.

MOD_UNLOAD should now only fail if it is impossible (as opposed to
inconvenient) to unload the module. Valid reasons are memory references
into the module which cannot be tracked down and eliminated.

When kldunloading, we abandon if MOD_UNLOAD fails, and if -force is
not given, MOD_QUIESCE failing will also prevent the unload.

For backwards compatibility, we treat EOPNOTSUPP from MOD_QUIESCE as
success.

Document that modules should return EOPNOTSUPP for unknown events.


# 39981fed 01-Jul-2004 John Baldwin <jhb@FreeBSD.org>

Trim a few things from the dmesg output and stick them under bootverbose to
cut down on the clutter including PCI interrupt routing, MTRR, pcibios,
etc.

Discussed with: USENIX Cabal


# 23eb3eb6 17-May-2004 Peter Wemm <peter@FreeBSD.org>

Since we go to the trouble of compiling the kobj ops table for each class,
and cannot handle it going away, add an explicit reference to the kobj
class inside each linker class. Without this, a class with no modules
loaded will sit with an idle refcount of 0. Loading and unloading
a module with it causes a 0->1->0 transition which frees the ops table
and causes subsequent loads using that class to explode. Normally, the
"kernel" module will remain forever loaded and prevent this happening, but
if you have more than one linker class active, only one owns the "kernel".

This finishes making modules work for kldload(8) on amd64.


# 24554d00 09-Apr-2004 Peter Edwards <peadar@FreeBSD.org>

Plug minor memory leak of module_t structures when unloading a file
from the kernel.

Reviewed By: Doug Rabson (dfr@)


# 47934cef 25-Feb-2004 Don Lewis <truckman@FreeBSD.org>

Split the mlock() kernel code into two parts, mlock(), which unpacks
the syscall arguments and does the suser() permission check, and
kern_mlock(), which does the resource limit checking and calls
vm_map_wire(). Split munlock() in a similar way.

Enable the RLIMIT_MEMLOCK checking code in kern_mlock().

Replace calls to vslock() and vsunlock() in the sysctl code with
calls to kern_mlock() and kern_munlock() so that the sysctl code
will obey the wired memory limits.

Nuke the vslock() and vsunlock() implementations, which are no
longer used.

Add a member to struct sysctl_req to track the amount of memory
that is wired to handle the request.

Modify sysctl_wire_old_buffer() to return an error if its call to
kern_mlock() fails. Only wire the minimum of the length specified
in the sysctl request and the length specified in its argument list.
It is recommended that sysctl handlers that use sysctl_wire_old_buffer()
should specify reasonable estimates for the amount of data they
want to return so that only the minimum amount of memory is wired
no matter what length has been specified by the request.

Modify the callers of sysctl_wire_old_buffer() to look for the
error return.

Modify sysctl_old_user to obey the wired buffer length and clean up
its implementation.

Reviewed by: bms


# b15572e3 23-Sep-2003 Max Khon <fjoe@FreeBSD.org>

Avoid NULL pointer dereferencing in modlist_lookup2().

PR: 56570
Submitted by: Thomas Wintergerst <Thomas.Wintergerst@nord-com.net>


# 7c89f162 27-Jul-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.


# 677b542e 10-Jun-2003 David E. O'Brien <obrien@FreeBSD.org>

Use __FBSDID().


# 6de61153 03-Mar-2003 Ruslan Ermilov <ru@FreeBSD.org>

FreeBSD 5.0 has stopped shipping /modules 2.5 years ago. Catch
up with this further by excluding /modules from the (default)
kern.module_path.


# a163d034 18-Feb-2003 Warner Losh <imp@FreeBSD.org>

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 44956c98 21-Jan-2003 Alfred Perlstein <alfred@FreeBSD.org>

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 7251b4bf 20-Jan-2003 Jake Burkholder <jake@FreeBSD.org>

Resolve relative relocations in klds before trying to parse the module's
metadata. This fixes module dependency resolution by the kernel linker on
sparc64, where the relocations for the metadata are different than on other
architectures; the relative offset is in the addend of an Elf_Rela record
instead of the original value of the location being patched.
Also fix printf formats in debug code.

Submitted by: Hartmut Brandt <brandt@fokus.gmd.de>
PR: 46732
Tested on: alpha (obrien), i386, sparc64


# f97182ac 14-Dec-2002 Alfred Perlstein <alfred@FreeBSD.org>

unwrap lines made short enough by SCARGS removal


# d1e405c5 13-Dec-2002 Alfred Perlstein <alfred@FreeBSD.org>

SCARGS removal take II.


# bc9e75d7 13-Dec-2002 Alfred Perlstein <alfred@FreeBSD.org>

Backout removal SCARGS, the code freeze is only "selectively" over.


# 0bbe7292 13-Dec-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove SCARGS.

Reviewed by: md5


# a3df768b 19-Nov-2002 Robert Watson <rwatson@FreeBSD.org>

Merge kld access control checks from the MAC tree: these access control
checks permit policy modules to augment the system policy for permitting
kld operations. This permits policies to limit access to kld operations
based on credential (and other) properties, as well as to perform checks
on the kld being loaded (integrity, etc).

Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# 3b132a61 17-Oct-2002 Sam Leffler <sam@FreeBSD.org>

fix kldload error return when a module is rejected because it's statically
linked in the kernel. When this condition is detected deep in the linker
internals the EEXIST error code that's returned is stomped on and instead
an ENOEXEC code is returned. This makes apps like sysinstall bitch.


# 7c61d785 15-Oct-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Plug a memory-leak.

"I think you're right" by: jake


# 9ca43589 15-Aug-2002 Robert Watson <rwatson@FreeBSD.org>

In order to better support flexible and extensible access control,
make a series of modifications to the credential arguments relating
to file read and write operations to cliarfy which credential is
used for what:

- Change fo_read() and fo_write() to accept "active_cred" instead of
"cred", and change the semantics of consumers of fo_read() and
fo_write() to pass the active credential of the thread requesting
an operation rather than the cached file cred. The cached file
cred is still available in fo_read() and fo_write() consumers
via fp->f_cred. These changes largely in sys_generic.c.

For each implementation of fo_read() and fo_write(), update cred
usage to reflect this change and maintain current semantics:

- badfo_readwrite() unchanged
- kqueue_read/write() unchanged
pipe_read/write() now authorize MAC using active_cred rather
than td->td_ucred
- soo_read/write() unchanged
- vn_read/write() now authorize MAC using active_cred but
VOP_READ/WRITE() with fp->f_cred

Modify vn_rdwr() to accept two credential arguments instead of a
single credential: active_cred and file_cred. Use active_cred
for MAC authorization, and select a credential for use in
VOP_READ/WRITE() based on whether file_cred is NULL or not. If
file_cred is provided, authorize the VOP using that cred,
otherwise the active credential, matching current semantics.

Modify current vn_rdwr() consumers to pass a file_cred if used
in the context of a struct file, and to always pass active_cred.
When vn_rdwr() is used without a file_cred, pass NOCRED.

These changes should maintain current semantics for read/write,
but avoid a redundant passing of fp->f_cred, as well as making
it more clear what the origin of each credential is in file
descriptor read/write operations.

Follow-up commits will make similar changes to other file descriptor
operations, and modify the MAC framework to pass both credentials
to MAC policy modules so they can implement either semantic for
revocation.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# f2b17113 02-Aug-2002 Maxime Henrion <mux@FreeBSD.org>

Make the consumers of the linker_load_file() function use
linker_load_module() instead.

This fixes a bug where the kernel was unable to properly locate and
load a kernel module in vfs_mount() (and probably in the netgraph
code as well since it was using the same function). This is because
the linker_load_file() does not properly search the module path.

Problem found by: peter
Reviewed by: peter
Thanks to: peter


# dcbe050b 22-Jul-2002 Don Lewis <truckman@FreeBSD.org>

Pre-wire the output buffer so that sysctl_kern_function_list() doesn't
block in SYSCTL_OUT() while holding a lock.


# 31965a72 07-Jul-2002 Jeff Roberson <jeff@FreeBSD.org>

- Delay unlocking a vnode in linker_hints_lookup until we're actually done
with it.
- Remove a now stale comment about improper vnode locking.


# 2eb7b21b 19-Jun-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Remove the lock(9) protecting the kernel linker system.
- Added a mutex, kld_mtx, to protect the kernel_linker system. Note that
while ``classes'' is global (to that file), it is only read only after
SI_SUB_KLD, SI_ORDER_ANY.
- Add a SYSINIT to flip a flag that disallows class registration after
SI_SUB_KLD, SI_ORDER_ANY.

Idea for ``classes'' read only by: jake
Reviewed by: jake


# b94c4e9a 26-Apr-2002 Brian Somers <brian@FreeBSD.org>

Test if rootvnode is NULL rather than if rootdev is NODEV when determining
if there's a filesystem present.

rootdev can be NODEV in the NFS-mounted root scenario.

Discussed with: Harti Brandt <brandt@fokus.gmd.de>, iedowse


# f1e4a6e9 09-Apr-2002 Brian Somers <brian@FreeBSD.org>

In linker_load_module(), check that rootdev != NODEV before calling
linker_search_module().

Without this, modules loaded from loader.conf that then try to load
in additional modules (such as digi.ko loading a card's BIOS) die
badly in the vn_open() called from linker_search_module().

It may be worth checking (KASSERTing?) that rootdev != NODEV in
vn_open() too.


# 96987c74 09-Apr-2002 Brian Somers <brian@FreeBSD.org>

Change linker_reference_module() so that it's passed a struct
mod_depend * (which may be NULL). The only consumer of this
function at the moment is digi_loadmoduledata(), and that passes
a NULL mod_depend *.

In linker_reference_module(), check to see if we've already got
the required module loaded. If we have, bump the reference count
and return that, otherwise continue the module search as normal.


# 44731cab 01-Apr-2002 John Baldwin <jhb@FreeBSD.org>

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 517f30c2 25-Mar-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Recommit the securelevel_gt() calls removed by commits rev. 1.84 of
kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the
source of the recent panics occuring around kldloading file system
support modules.

Requested by: rwatson


# fe3240e9 21-Mar-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Back out the commit to make the linker_load_file() securelevel check
made aware in jail environments. Supposedly something is broken, so
this should be backed out until further investigation proves otherwise,
or a proper fix can be provided.


# e85b9ae9 21-Mar-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Fix a logic error in checking the securelevel that was introduced in the
previous commit.

Pointy hats to: arr, rwatson


# c457a440 20-Mar-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Change a check of securelevel to securelevel_gt() call in order to help
against users within a jail attempting to load kernel modules.
- Add a check of securelevel_gt() to vfs_mount() in order to chop some
low hanging fruit for the repair of securelevel checking of linking and
unlinking files from within jails. There is more to be done here.

Reviewed by: rwatson


# 08a54da7 19-Mar-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Change a malloc / bzero pair to make use of the M_ZERO malloc(9) flag.


# 9b3851e9 18-Mar-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Lock down the ``module'' structure by adding an SX lock that is used by
all the global bits of ``module'' data. This commit adds a few generic
macros, MOD_SLOCK, MOD_XLOCK, etc., that are meant to be used as ways
of accessing the SX lock. It is also the first step in helping to lock
down the kernel linker and module systems.

Reviewed by: jhb, jake, smp@


# 6c75a65a 10-Mar-2002 David Malone <dwmalone@FreeBSD.org>

Don't assign strcmp to a variable called err and then compare it
with zero, just compare strcmp with zero. This fixes the same bug
which Maxim just fixed and fixes some odd style too.

PR: 35712
Reviewed by: arr


# 832af2d5 10-Mar-2002 Maxim Sobolev <sobomax@FreeBSD.org>

Fix a breakage introduced in rev.1.75 (supposedly style cleanup), which results
in "missing dependencies" error when loading some kld modules. It is sad to
see how often these days style cleanus break doesn't broken things. Perhaps
people should recall good old principle: "don't fix it if it isn't broken".


# a854ed98 27-Feb-2002 John Baldwin <jhb@FreeBSD.org>

Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.


# e68baa70 22-Feb-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Whitespace fixes leftover from previous commit.

Submitted by: bde


# 8e92b63c 21-Feb-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Massive style fixup.

Reviewed by: mike
Approved by: dfr


# 894c9fe0 10-Feb-2002 Robert Watson <rwatson@FreeBSD.org>

Add a comment indicating that the vnode locking in this section of the
kernel linker code may be wrong: it fails to hold a lock across the
call to VOP_GETATTR(), and vn_rdwr() with IO_NODELOCKED.


# b489b407 18-Nov-2001 Andrew R. Reiter <arr@FreeBSD.org>

- Ensure that linker file id's are unique, rather than blindly
incrementing the value.

Reviewed by: dfr, peter


# 7b9716ba 16-Nov-2001 Ian Dowse <iedowse@FreeBSD.org>

Fix a number of misspellings of "dependency" and "dependencies" in
comments and function names.

PR: kern/8589
Submitted by: Rajesh Vaidheeswarran <rv@fore.com>


# fc5d29ef 01-Nov-2001 Robert Watson <rwatson@FreeBSD.org>

o Move suser() calls in kern/ to using suser_xxx() with an explicit
credential selection, rather than reference via a thread or process
pointer. This is part of a gradual migration to suser() accepting
a struct ucred instead of a struct proc, simplifying the reference
and locking semantics of suser().

Obtained from: TrustedBSD Project


# bb9fe9dd 30-Oct-2001 Brian Feldman <green@FreeBSD.org>

Add the sysctl "kern.function_list", which currently exports all
function symbols in the kernel in a list of C strings, with an extra
nul-termination at the end.

This sysctl requires addition of a new linker operation. Now,
linker_file_t's need to respond to "each_function_name" to export
their function symbols.

Note that the sysctl doesn't currently allow distinguishing multiple
symbols with the same name from different modules, but could quite
easily without a change to the linker operation. This will be a nicety
to have when it can be used.

Obtained from: NAI Labs CBOSS project
Funded by: DARPA


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 8ee6d9e9 11-Sep-2001 Peter Wemm <peter@FreeBSD.org>

Fix the kern.module_path issue that required the trailing '/' character
on each module path component. Fix a one-byte buffer overflow at the
same time that got highlighted in the process.


# 505222d3 10-Sep-2001 Peter Wemm <peter@FreeBSD.org>

Implement the long-awaited module->file cache database. A userland
tool (kldxref(8)) keeps a cache of what modules and versions are inside
what .ko files. I have tested this on both Alpha and i386.

Submitted by: bp


# 835a82ee 01-Sep-2001 Matthew Dillon <dillon@FreeBSD.org>

Giant Pushdown. Saved the worst P4 tree breakage for last.

reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit()
setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp()
getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid()
setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid()
setresuid() setresgid() getresuid() getresgid () __setugid() getlogin()
setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload()
kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize()
dup2() dup() fcntl() close() ofstat() fstat() nfsstat() fpathconf()
flock()


# fcd7e670 19-Aug-2001 Dima Dorfman <dd@FreeBSD.org>

Sync the default module search path with the one in
sys/boot/common/module.c.

PR: 21405
Submitted by: Makoto MATSUSHITA <matusita@jp.FreeBSD.org>


# 98b0e9d5 30-Jul-2001 Jake Burkholder <jake@FreeBSD.org>

Don't try to print a field that doesn't exist; in usually commented
out debugging code.


# 0e79fe6f 20-Jun-2001 Dag-Erling Smørgrav <des@FreeBSD.org>

Constify (silence warnings introduced by last commit to sys/module.h)


# 09dbb404 18-Jun-2001 Brian Somers <brian@FreeBSD.org>

Add linker_reference_module().

This function loads a module if required, otherwise bumps the reference
count -- the opposite of linker_file_unload().


# f41325db 13-Jun-2001 Peter Wemm <peter@FreeBSD.org>

With this commit, I hereby pronounce gensetdefs past its use-by date.

Replace the a.out emulation of 'struct linker_set' with something
a little more flexible. <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.

The linker_set.h macros have been on the back burner in various
forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).

The macros declare a strongly typed set. They return elements with the
type that you declare the set with, rather than a generic void *.

For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.

For a.out, we use the old linker_set struct.

NOTE: the item lists are no longer null terminated. This is why
the code impact is high in certain areas.

The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.

linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.

Reviewed by: eivind


# 81930014 06-Jun-2001 Peter Wemm <peter@FreeBSD.org>

Make the TUNABLE_*() macros look and behave more consistantly like the
SYSCTL_*() macros. TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.


# a91f68bc 22-Mar-2001 Boris Popov <bp@FreeBSD.org>

o Actually extract version of interface and store it along with the name.

o Add new parameter to the modlist_lookup() function to perform lookups
with strict version matching.

o Collapse duplicate code to function(s).


# 303b15f1 22-Mar-2001 Boris Popov <bp@FreeBSD.org>

Slightly reorganize code in the linker_load_dependancies() function to make
codepath more straightforward.


# 804f2729 22-Mar-2001 Boris Popov <bp@FreeBSD.org>

Remove support for old way of handling module dependencies.

Approved by: peter


# 37d40066 04-Feb-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Another round of the <sys/queue.h> FOREACH transmogriffer.

Created with: sed(1)
Reviewed by: md5(1)


# fc2ffbe6 04-Feb-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)


# 4058c0f0 28-Dec-2000 Peter Wemm <peter@FreeBSD.org>

Pull out the module path from the loader. ie: if you boot from
/boot/kernel.foobar/* then that had better be in the path ahead of the
others.

Submitted by: Daniel J. O'Connor <darius@dons.net.au>
PR: 23662


# 7cc0979f 08-Dec-2000 David Malone <dwmalone@FreeBSD.org>

Convert more malloc+bzero to malloc+M_ZERO.

Submitted by: josh@zipperup.org
Submitted by: Robert Drehmel <robd@gmx.net>


# c9b00477 04-Oct-2000 Doug Rabson <dfr@FreeBSD.org>

Add a workaround for statically linked kernels.


# 6b6821c7 06-Sep-2000 David E. O'Brien <obrien@FreeBSD.org>

The kernel is now known as `kernel.ko' and it and its matching modules
live in ``/boot/kernel/''.

Submitted by: Hisashi Hiramoto <hiramoto@phys.chs.nihon-u.ac.jp>


# 2c7f8b4e 02-Aug-2000 Peter Wemm <peter@FreeBSD.org>

Fix self referential dependencies. eg: uhub was packaged along with
usb, all in usb.ko. uhub depends on usb. The bug was that the preload
processing only adds a module to the list once it's internal dependencies
are resolved... Since it was not "seeing" the internal usb module it
believed that uhub had a missing dependency.


# 2ff08731 09-Jul-2000 Boris Popov <bp@FreeBSD.org>

Correct SYSINIT execution order in the case when KLD contains more
than one SYSINIT with the same 'subsystem' id and different 'order' id.

Reviewed by: peter


# e6796b67 03-Jul-2000 Kirk McKusick <mckusick@FreeBSD.org>

Move the truncation code out of vn_open and into the open system call
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Even if it fails to get the lock it would still truncate the
file even though its open failed. With this change, the truncation
is done only after the lock is successfully acquired.

Obtained from: BSD/OS


# 6c66bbed 29-Jun-2000 Archie Cobbs <archie@FreeBSD.org>

Move the securelevel check before loading KLD's into linker_load_file(),
instead of requiring every caller of linker_load_file() to perform the
check itself. This avoids netgraph loading KLD's when securelevel > 0,
not to mention any future code that may call linker_load_file().

Reviewed by: dfr


# e3975643 25-May-2000 Jake Burkholder <jake@FreeBSD.org>

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 740a1973 23-May-2000 Jake Burkholder <jake@FreeBSD.org>

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 2c9b67a8 30-Apr-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unneeded #include <vm/vm_zone.h>

Generated by: src/tools/tools/kerninclude


# 54823af2 29-Apr-2000 Peter Wemm <peter@FreeBSD.org>

First round implementation of a fine grain enhanced module to module
version dependency system. This isn't quite finished, but it is at a
useful stage to do a functional checkpoint.

Highlights:
- version and dependency metadata is gathered via linker sets, so things
are handled the same for static kernels and code built to live in a kld.
- The dependencies are at module level (versus at file level).
- Dependencies determine kld symbol search order - this means that you
cannot link against symbols in another file unless you depend on it. This
is so that you cannot accidently unload the target out from underneath
the ones referencing it.
- It is flexible enough that we can put tags in #include files and macros
so that we can get decent hooks for enforcing recompiles on incompatable
ABI changes. eg: if we change struct proc, we could force a recompile
for all kld's that reference the proc struct.
- Tangled dependency references at boot time are sorted. Files are
relocated once all their dependencies are already relocated.

Caveats:
- Loader support is incomplete, but has been worked on seperately.
- Actual enforcement of the version number tags is not active yet - just
the module dependencies are live. The actual structure of versioning
hasn't been agreed on yet. (eg: major.minor, or whatever)
- There is some backwards compatability for old modules without metadata
but I'm not sure how good it is.

This is based on work originally done by Boris Popov (bp@freebsd.org),
but I'm not sure he'd recognize much of it now. Don't blame him. :-)
Also, ideas have been borrowed from Mike Smith.


# 326e27d8 24-Apr-2000 Doug Rabson <dfr@FreeBSD.org>

* Rewrite to use kobj(9) instead of hard-coded function tables.
* Report link errors to stdout with uprintf() so that the user can see
what went wrong (PR kern/9214).
* Add support code to allow module symbols to be loaded into GDB using
the debugger's "sharedlibrary" command.


# 762e6b85 15-Dec-1999 Eivind Eklund <eivind@FreeBSD.org>

Introduce NDFREE (and remove VOP_ABORTOP)


# 45371389 10-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Zap c_index() and c_rindex(). Bruce prefers these to implicitly convert
a const into a non-const as they do in libc. I feel that defeating the
type checking like that quite evil, but that's the way it is.


# 95dc37f6 20-Nov-1999 Peter Wemm <peter@FreeBSD.org>

Tempt fate and stop index from converting a const char * into a char *.
I've made a seperate version (c_index() etc) that use const/const, but
I'm not sure it's worth it considering there is one file in the tree
that uses index on const strings (kern_linker.c) and it's easily adjusted
to scan the strings directly (and is perhaps more efficient that way).


# d1f088da 11-Oct-1999 Peter Wemm <peter@FreeBSD.org>

Trim unused options (or #ifdef for undoc options).

Submitted by: phk


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 0921e488 23-Aug-1999 Bruce Evans <bde@FreeBSD.org>

Cast pointers to uintptr_t instead of casting them to u_long. They
are still converted to u_long by assignment of the uintptr_t, and
address calculations are still done using u_long. This is OK for
currently supported machines, but addresses should be represented
by vm_offset_t or uintptr_t in case pointers are longer than longs.

"Fixed" size of linker_path[]. MAXPATHLEN + 1 was 1 too large for
search paths with only one file path in them, but much too small
for search paths with several long file paths in them.


# 4033a962 19-Aug-1999 Greg Lehey <grog@FreeBSD.org>

Change the name of the static variable 'files' to 'linker_files' in
order to be able to refer to it uniquely from the kernel debugger.

Approved-by: peter


# 9c8b8baa 01-Jul-1999 Peter Wemm <peter@FreeBSD.org>

Slight reorganization of kernel thread/process creation. Instead of using
SYSINIT_KT() etc (which is a static, compile-time procedure), use a
NetBSD-style kthread_create() interface. kproc_start is still available
as a SYSINIT() hook. This allowed simplification of chunks of the
sysinit code in the process. This kthread_create() is our old kproc_start
internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work
the same as the NetBSD one.

One thing I'd like to do shortly is get rid of nfsiod as a user initiated
process. It makes sense for the nfs client code to create them on the
fly as needed up to a user settable limit. This means that nfsiod
doesn't need to be in /sbin and is always "available". This is a fair bit
easier to do outside of the SYSINIT_KT() framework.


# df8abd0b 30-Jun-1999 Peter Wemm <peter@FreeBSD.org>

Slight tweak to fork1() calling conventions. Add a third argument so
the caller can easily find the child proc struct. fork(), rfork() etc
syscalls set p->p_retval[] themselves. Simplify the SYSINIT_KT() code
and other kernel thread creators to not need to use pfind() to find the
child based on the pid. While here, partly tidy up some of the fork1()
code for RF_SIGSHARE etc.


# b5b15c3f 08-May-1999 Peter Wemm <peter@FreeBSD.org>

First stages of a module dependency cleanup. This part fixes a
particularly annoying hack, namely having the linker bash the moduledata
to set the container pointer, preventing it being const. In the process,
a stack of warnings were fixed and will probably allow a revisit of the
const C_SYSINIT() changes. This explicitly registers modules in files or
preload areas with the module system first, and let them initialize via
SYSINIT/DECLARE_MODULE later in their SI_ORDER_xxx order. The kludge of
finding the containing file is no longer needed since the registration
of modules onto the modules list is done in the context of initializing
the linker file.


# 5206bca1 27-Apr-1999 Luoqi Chen <luoqi@FreeBSD.org>

Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
is added to point to per-cpu pages, per-cpu global variables are now
accessed through this new selector (%fs). The selectors in gdt table are
rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by: Alan Cox <alc@cs.rice.edu>
John Dyson <dyson@iquest.net>
Julian Elischer <julian@whistel.com>
Bruce Evans <bde@zeta.org.au>
David Greenman <dg@root.com>


# f711d546 27-Apr-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


# 88b4f4ee 05-Apr-1999 Peter Wemm <peter@FreeBSD.org>

LK_RETRY is a vn_lock() flag, not one for lockmgr().


# a199ed3c 07-Mar-1999 Doug Rabson <dfr@FreeBSD.org>

* Register sysctl nodes before running sysinits when loading files and
unregister them after sysuninits when unloading.
* Add code to vfs_register() to set the oid number of vfs sysctls to
the type number of the filesystem.

Reviewed by: bde


# 75e08a5e 20-Feb-1999 Doug Rabson <dfr@FreeBSD.org>

A correction to the code which attempts to prevent the same module
being loaded twice. It used rindex() to strip the pathname but failed
to account for the fact that rindex() will return a pointer to the '/',
not the first character of the filename.

Submitted by: Nick Hibma <hibma@skylink.it>


# ce02431f 16-Feb-1999 Doug Rabson <dfr@FreeBSD.org>

* Change sysctl from using linker_set to construct its tree using SLISTs.
This makes it possible to change the sysctl tree at runtime.

* Change KLD to find and register any sysctl nodes contained in the loaded
file and to unregister them when the file is unloaded.

Reviewed by: Archie Cobbs <archie@whistle.com>,
Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)


# fe08c21a 27-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile.

This commit includes significant work to proper handle const arguments
for the DDB symbol routines.


# d254af07 27-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile


# 149a155c 25-Jan-1999 Doug Rabson <dfr@FreeBSD.org>

Don't try to call SYSUNINIT functions if there was a link error.

Reviewed by: Peter Wemm <peter@netplex.com.au>


# 461b36ab 22-Jan-1999 Peter Wemm <peter@FreeBSD.org>

Update userref handling after discussion with submitter of previous
patch. lf can't be dereferenced after the unload attempt, in case it
was freed. Instead, decrement first and back it out if the unload failed.
This should be relatively immune to races caused by the user since the
userref count will be zero for the duration of the actual unloading and
will stop further kldunload attempts.

Submitted by: Ustimenko Semen <semen@iclub.nsu.ru>


# d7dfdda2 19-Jan-1999 Peter Wemm <peter@FreeBSD.org>

Relax linkage symbol scope restrictions to be more compatable with that
of shared libraries.


# e75a9dc0 19-Jan-1999 Peter Wemm <peter@FreeBSD.org>

Don't decrement userrefs unless the file was actually was unloaded.

Submitted by: Ustimenko Semen <semen@iclub.nsu.ru>


# e99f57c3 17-Jan-1999 Peter Wemm <peter@FreeBSD.org>

Try and clean up the multiple formal loading support a bit, based on
suggestions from Greg Lehey some time ago. In the face of multiple
potential file formats, try and give a more sensible error than just
ENOEXEC.

XXX a good case can be made that the loading process is wrong - the linker
should locate the file first (using the search paths etc), then run the
loaders to see if they recognize it. While the present system allows for
the possibility of different search paths for different formats, we do not
use it and it just makes things more complicated than they need to be.


# f1b26522 05-Jan-1999 Mike Smith <msmith@FreeBSD.org>

Don't allow more than one module with the same name to be loaded.
Make kldfind ignore the path when searching for a loaded module.

Submitted by: John Birrell (jb@freebsd.org)


# ba031106 11-Nov-1998 Peter Wemm <peter@FreeBSD.org>

kldsym(2) prototype implementation


# edfbe150 10-Nov-1998 Peter Wemm <peter@FreeBSD.org>

Arrange for unload-time linker set hooks to be called. While cut/pasting
some code, I changed the original to be consistant with the rest of the
file rather than duplicating the problems.


# 21ce23eb 06-Nov-1998 Peter Wemm <peter@FreeBSD.org>

Define the kld_debug variable if KLD_DEBUG is enabled


# 84e40f56 04-Nov-1998 Peter Wemm <peter@FreeBSD.org>

The handle for the kernel is common. With this fix, ELF kernels can load
a.out kld modules, and a.out kernels can load ELF kld modules.


# 78377454 03-Nov-1998 Peter Wemm <peter@FreeBSD.org>

Have the in-kernel linker try a default extension of .ko. This means that
"kldload nfs" works. We use the same default extension in the /boot/loader
system.


# b913711e 03-Nov-1998 Peter Wemm <peter@FreeBSD.org>

Use the kvm space pathname that we copied in, not the one in user space.


# 6fe8861e 24-Oct-1998 Mike Smith <msmith@FreeBSD.org>

Don't put 0x in front of %p, it does it already.
Submitted by: Brian Feldman <green@janus.syracuse.net>


# bd4e381b 15-Oct-1998 Peter Wemm <peter@FreeBSD.org>

- bzero() after malloc(). This is especially obvious when kern_malloc is
compiled with DIAGNOSTIC.
- Don't break from the preload module processing loop prematurely.


# 26deceba 09-Oct-1998 Peter Wemm <peter@FreeBSD.org>

Display module type as well as module name when we find one preloaded.


# 51f3fe7a 09-Oct-1998 Peter Wemm <peter@FreeBSD.org>

Use Mike Smith's linker module search path code.
Implement preloading in a fairly MI way, assuming the information is
prepared.
DDB interface helpers.. Provide some support for db_kld.c so that we
don't have to export too much detail.
Debugging and cosmetic nits left in from development..
The other half of the containing file hack so modules can associate
themselves with their "file".


# a2c99e3e 12-Aug-1998 Doug Rabson <dfr@FreeBSD.org>

Modify the internal interfaces to the kernel linker to make it possible
for DDB to use its symbol tables.


# b1679c0f 01-Jan-1998 Bruce Evans <bde@FreeBSD.org>

Use a real malloc type for M_LINKER instead of #defining it as M_TEMP.

Fixed a comment.


# 74b2192a 11-Dec-1997 John Dyson <dyson@FreeBSD.org>

We have had support for running the kernel daemons as threads for
quite a while, but forgot to do so. For now, this code supports
most daemons running as kernel threads in UP kernels, and as
full processes in SMP. We will soon be able to run them as
threads in SMP, but not yet.


# d73424aa 20-Nov-1997 Bruce Evans <bde@FreeBSD.org>

Fixed a sloppy common-style definitions.


# cb226aaa 06-Nov-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.


# 1fd0b058 02-Aug-1997 Bruce Evans <bde@FreeBSD.org>

Removed unused #includes.


# cea6c86c 07-May-1997 Doug Rabson <dfr@FreeBSD.org>

This is the kernel linker. To use it, you will first need to apply
the patches in freefall:/home/dfr/ld.diffs to your ld sources and set
BINFORMAT to aoutkld when linking the kernel.

Library changes and userland utilities will appear in a later commit.