History log of /freebsd-current/sys/sys/conf.h
Revision Date Author Comments
# 2aee804c 27-Mar-2024 Stephen J. Kiernan <stevek@FreeBSD.org>

kerneldump: Add flag to indicate kernel core was successfully dumped

This allows for shutdown_final EVENTHANDLERs to know that a core dump
successfully occurred. Embedded systems may want to record this fact
or act on it.

Obtained from: Juniper Networks, Inc.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44542


# d3efbe01 21-Mar-2024 Konstantin Belousov <kib@FreeBSD.org>

cdevpriv(9): add iterator

Reviewed by: christos
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44469


# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 2ff63af9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .h pattern

Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/


# c3179891 20-Mar-2023 Mark Johnston <markj@FreeBSD.org>

kerneldump: Inline dump_savectx() into its callers

The callers of dump_savectx() (i.e., doadump() and livedump_start())
subsequently call dumpsys()/minidumpsys(), which dump the calling
thread's stack when writing the dump. If dump_savectx() gets its own
stack frame, that frame might be clobbered when its caller later calls
dumpsys()/minidumpsys(), making it difficult for debuggers to unwind the
stack.

Fix this by making dump_savectx() a macro, so that savectx() is always
called directly by the function which subsequently calls
dumpsys()/minidumpsys().

This fixes stack unwinding for the panicking thread from arm64
minidumps. The same happened to work on amd64, but kgdb reports the
dump_savectx() calls as coming from dumpsys(), so in that case it
appears to work by accident.

Fixes: c9114f9f86f9 ("Add new vnode dumper to support live minidumps")
Reviewed by: mhorne, jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39151
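
The difference between a helper function and a macro can be illustrated in plain userspace C. This is only a sketch: struct saved_ctx, save_ctx_func(), SAVE_CTX_MACRO() and the other names are hypothetical stand-ins, and recording __func__ merely models "which function's frame the context was captured in"; it is not the kernel's savectx().

```c
#include <string.h>

/* Hypothetical stand-in for a saved context: the capturing function's name. */
struct saved_ctx {
	const char *captured_in;
};

/* As a function, the capture happens inside the helper's own frame. */
static void
save_ctx_func(struct saved_ctx *ctx)
{
	ctx->captured_in = __func__;
}

/* As a macro, the capture is expanded into whichever function uses it. */
#define	SAVE_CTX_MACRO(ctx)	((ctx)->captured_in = __func__)

/* Models a caller that goes through the helper function. */
static void
dump_via_function(struct saved_ctx *ctx)
{
	save_ctx_func(ctx);
}

/* Models a caller that uses the macro, so the capture is in its own frame. */
static void
dump_via_macro(struct saved_ctx *ctx)
{
	SAVE_CTX_MACRO(ctx);
}

/* True when the context was captured in the named function. */
static int
captured_in(const struct saved_ctx *ctx, const char *name)
{
	return (strcmp(ctx->captured_in, name) == 0);
}
```

With the macro form the capture site is the caller itself, which is the property the commit relies on to keep the saved frame alive while the dump is written.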


# 497240de 19-Aug-2022 Mateusz Guzik <mjg@FreeBSD.org>

Retire clone_drain_lock

It is only ever xlocked in drain_dev_clone_events and the only consumer of
that routine does not need it -- eventhandler code already makes sure the
relevant callback is no longer running.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D36268


# db71383b 13-May-2022 Mitchell Horne <mhorne@FreeBSD.org>

kerneldump: remove physical from dump routines

It is unused, especially now that the underlying d_dumper methods do not
accept the argument.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35174


# 489ba222 13-May-2022 Mitchell Horne <mhorne@FreeBSD.org>

kerneldump: remove physical argument from d_dumper

The physical address argument is essentially ignored by every dumper
method. In addition, the dump routines don't actually pass a real
address; every call to dump_append() passes a value of zero for
physical.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35173


# 0f50da2e 13-May-2022 Mitchell Horne <mhorne@FreeBSD.org>

Drop d_dump from struct cdevsw

It appears to be unused. These days struct disk has a d_dump member,
which is what gets passed to the kernel dump framework.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35172


# c9114f9f 23-Mar-2021 Mitchell Horne <mhorne@FreeBSD.org>

Add new vnode dumper to support live minidumps

This dumper can instantiate and write the dump's contents to a
file-backed vnode.

Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.

As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.

A future change to savecore(8) will add an option to save a live dump.

Reviewed by: markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with: kib
MFC after: 3 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D33813


# 59c27ea1 09-Aug-2021 Mitchell Horne <mhorne@FreeBSD.org>

Split out dumper allocation from list insertion

Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() calls this function and retains the existing
behaviour.

This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.

free_single_dumper() is made public and renamed to dumper_destroy().

Reviewed by: kib, markj
MFC after: 1 week
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34068
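
The split between allocation and list insertion might look roughly like the following userspace model. All names (struct model_dumper, model_dumper_create() and so on) are hypothetical, and the hand-rolled list stands in for whatever list the kernel really uses; only the shape of the split is taken from the commit message.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a dumper structure. */
struct model_dumper {
	int			 priority;	/* illustrative payload */
	struct model_dumper	*next;
};

/* Models the list of dumpers consulted at panic time. */
static struct model_dumper *panic_dumpers;

/* Allocate and configure a dumper without listing it. */
static struct model_dumper *
model_dumper_create(int priority)
{
	struct model_dumper *d;

	d = calloc(1, sizeof(*d));
	if (d != NULL)
		d->priority = priority;
	return (d);
}

/* Existing behaviour: create, then push onto the panic-time list. */
static struct model_dumper *
model_dumper_insert(int priority)
{
	struct model_dumper *d;

	d = model_dumper_create(priority);
	if (d != NULL) {
		d->next = panic_dumpers;
		panic_dumpers = d;
	}
	return (d);
}

/* Release a dumper that was created but never listed. */
static void
model_dumper_destroy(struct model_dumper *d)
{
	free(d);
}

/* Count the dumpers that would be tried at panic time. */
static int
model_panic_dumper_count(void)
{
	const struct model_dumper *d;
	int n = 0;

	for (d = panic_dumpers; d != NULL; d = d->next)
		n++;
	return (n);
}
```

A live-dump path would call only the create and destroy functions, so the panic-time list is never touched.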


# a9545eed 09-Dec-2021 Florian Walpen <dev@submerge.ch>

Add idle priority scheduling privilege group to MAC/priority

Add an idletime user group that allows non-root users to run processes
with idle scheduling priority. Privileges are granted by a MAC policy in
the mac_priority module. For this purpose, the kernel privilege
PRIV_SCHED_IDPRIO was added to sys/priv.h (kernel module ABI change).

Deprecate the system wide sysctl(8) knob
security.bsd.unprivileged_idprio which lets any user run idle priority
processes, regardless of context. While the knob is still working, it is
marked as deprecated in the description and in the man pages.

MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33338


# bf2fa8d9 04-Dec-2021 Florian Walpen <dev@submerge.ch>

MAC/priority module for realtime privilege group

This is a MAC policy module that grants scheduling privileges based on
group membership. Users or processes in the group realtime (gid 47) are
allowed to run threads and processes with realtime scheduling priority.
For timing-sensitive, low-latency software like audio/jack, running with
realtime priority helps to avoid stutter and gaps.

PR: 239125
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33191
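
The group-based decision can be sketched as a simple membership test. This is a userspace illustration only: MODEL_GID_RT_PRIO and model_grants_realtime() are hypothetical names, the gid value 47 is the one given above, and the real module works through the MAC framework's privilege hooks rather than a helper like this.

```c
#include <stdbool.h>
#include <stddef.h>

/* gid of the realtime group as stated in the commit message. */
#define	MODEL_GID_RT_PRIO	47u

/* Grant realtime scheduling when any of the caller's groups matches. */
static bool
model_grants_realtime(const unsigned int *groups, size_t ngroups)
{
	size_t i;

	for (i = 0; i < ngroups; i++) {
		if (groups[i] == MODEL_GID_RT_PRIO)
			return (true);
	}
	return (false);
}
```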


# 13a58148 06-Aug-2021 Eric van Gyzen <vangyzen@FreeBSD.org>

netdump: send key before dump, in case dump fails

Previously, if an encrypted netdump failed, such as due to a timeout or
network failure, the key was not saved, so a partial dump was
completely useless.

Send the key first, so the partial dump can be decrypted, because even a
partial dump can be useful.

Reviewed by: bdrewery, markj
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D31453


# fbeb4cca 06-Jun-2021 Mark Johnston <markj@FreeBSD.org>

Suppress D_NEEDGIANT warnings for some drivers

During boot we warn that the kbd and openfirm drivers are Giant-locked
and may be deleted. Generally, the warning helps signal that certain
old drivers are not being maintained and are subject to removal, but
this doesn't really apply to certain drivers which are harder to
detangle from Giant.

Add a flag, D_GIANTOK, that devices can specify to suppress the
misleading warning. Use it in the kbd and openfirm drivers.

Reviewed by: imp, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30649
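
The warning logic described above reduces to a two-bit test. In this sketch the MODEL_ prefixed macros and should_warn_giant() are hypothetical, and the bit values are illustrative only; the real D_NEEDGIANT and D_GIANTOK definitions live in sys/conf.h.

```c
#include <stdbool.h>

/* Illustrative bit values, not the ones used by sys/conf.h. */
#define	MODEL_D_NEEDGIANT	0x1u
#define	MODEL_D_GIANTOK		0x2u

/* Warn only for Giant-locked drivers that have not opted out. */
static bool
should_warn_giant(unsigned int d_flags)
{
	return ((d_flags & MODEL_D_NEEDGIANT) != 0 &&
	    (d_flags & MODEL_D_GIANTOK) == 0);
}
```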


# f6e54eb3 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

sys: clean up empty lines in .c and .h files


# 2ed5e423 15-Jun-2020 Rick Macklem <rmacklem@FreeBSD.org>

Expose UID_xxx and GID_xxx definitions to userspace.

This patch moves the UID_xxx and GID_xxx definitions out of the
#ifdef _KERNEL section, so that userspace programs like mountd
can use them.
There are a couple of userspace programs that do define UID_ROOT,
but they do not include sys/conf.h. Since they are defined as
the same value, maybe they should be changed to include sys/conf.h.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25281


# fe20aaec 22-Feb-2020 Ryan Libby <rlibby@FreeBSD.org>

sys/kern: quiet -Wwrite-strings

Quiet a variety of Wwrite-strings warnings in sys/kern at low-impact
sites. This patch avoids addressing certain others which would need to
plumb const through structure definitions.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D23798


# 1fccb43c 19-Nov-2019 Mateusz Guzik <mjg@FreeBSD.org>

vfs: change si_usecount management to count used vnodes

Currently si_usecount is effectively a sum of usecounts from all associated
vnodes. This is maintained by special-casing for VCHR every time usecount is
modified. Apart from complicating the code a little bit, it has a scalability
impact since it forces a read from a cacheline shared with said count.

There are no consumers of the feature in the ports tree. In head there are only
2: revoke and devfs_close. Both can get away with a weaker requirement than the
exact usecount, namely just the count of active vnodes. Changing the meaning to
the latter means we only need to modify it on 0<->1 transitions, avoiding the
check plenty of times (and entirely in something like vrefact).

Reviewed by: kib, jeff
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22202
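
A minimal model of the new counting rule, assuming simplified and hypothetical structures (struct model_cdev, struct model_vnode) and no locking or atomics: the device counter changes only when a vnode's own use count crosses between 0 and 1.

```c
/* Hypothetical, simplified stand-ins for struct cdev and struct vnode. */
struct model_cdev {
	int	si_usecount;	/* number of associated vnodes in use */
};

struct model_vnode {
	int			 v_usecount;
	struct model_cdev	*v_dev;
};

/* Take a reference; touch the device counter only on the 0 -> 1 transition. */
static void
model_vref(struct model_vnode *vp)
{
	if (vp->v_usecount++ == 0)
		vp->v_dev->si_usecount++;
}

/* Drop a reference; touch the device counter only on the 1 -> 0 transition. */
static void
model_vrele(struct model_vnode *vp)
{
	if (--vp->v_usecount == 0)
		vp->v_dev->si_usecount--;
}
```

Repeated references on an already active vnode never reach the shared counter, which is where the scalability benefit described above comes from.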


# addccb8c 17-Oct-2019 Conrad Meyer <cem@FreeBSD.org>

Add a very limited DDB dumpon(8)-alike to MI dumper code

This allows ddb(4) commands to construct a static dumperinfo during
panic/debug and invoke doadump(false) using the provided dumper
configuration (always inserted first in the list).

The intended usecase is a ddb(4)-time netdump(4) command.

Reviewed by: markj (earlier version)
Differential Revision: https://reviews.freebsd.org/D21448


# e3f35d56 05-Oct-2019 Kyle Evans <kevans@FreeBSD.org>

Remove the remnants of SI_CHEAPCLONE

SI_CHEAPCLONE was introduced in r66067 for use with cloned bpfs. It was
later also used in tty, tun, tap at points. The rough timeline for being
removed in each of these is as follows:

- r181690: bpf switched to use cdevpriv API by ed@
- r181905: ed@ rewrote the TTY layer to be mpsafe
- r204464: kib@ removes it from tun/tap, declaring it unused

I've not yet been able to dig up any other consumers in the intervening 9
years. It is no longer set on any devices in the tree and leaves an
interesting situation in make_dev_sv where we're ok with the device already
being set SI_NAMED.


# e2e050c8 19-May-2019 Conrad Meyer <cem@FreeBSD.org>

Extract eventhandler declarations to sys/_eventhandler.h

This allows replacing "sys/eventhandler.h" includes with "sys/_eventhandler.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.


# 6b6e2954 06-May-2019 Conrad Meyer <cem@FreeBSD.org>

List-ify kernel dump device configuration

Allow users to specify multiple dump configurations in a prioritized list.
This enables fallback to secondary device(s) if primary dump fails. E.g.,
one might configure a preference for netdump, but fall back to disk dump as a
second choice if netdump is unavailable.

This change does not list-ify netdump configuration, which is tracked
separately from ordinary disk dumps internally; only one netdump
configuration can be made at a time, for now. It also does not implement
IPv6 netdump.

savecore(8) is already capable of scanning and iterating multiple devices
from /etc/fstab or passed on the command line.

This change doesn't update the rc or loader variables 'dumpdev' in any way;
it can still be set to configure a single dump device, and rc.d/savecore
still uses it as a single device. Only dumpon(8) is updated to be able to
configure the more complicated configurations for now.

As part of revving the ABI, unify netdump and disk dump configuration ioctl
/ structure, and leave room for ipv6 netdump as a future possibility.
Backwards-compatibility ioctls are added to smooth ABI transition,
especially for developers who may not keep kernel and userspace perfectly
synced.

Reviewed by: markj, scottl (earlier version)
Relnotes: maybe
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D19996
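
The fallback behaviour over a prioritized list can be sketched as a loop that stops at the first configuration that succeeds. Everything here is a hypothetical userspace model: model_dump_fn, model_dump_with_fallback() and the two sample dumpers are invented for illustration, and the error value 5 is arbitrary.

```c
#include <stddef.h>

/* A dump attempt: returns 0 on success, a nonzero error otherwise. */
typedef int model_dump_fn(void);

/*
 * Try each configuration in priority order.  On success, store the index
 * of the configuration that worked in *used and return 0.  If every
 * attempt fails, return the last error (or -1 for an empty list).
 */
static int
model_dump_with_fallback(model_dump_fn *const configs[], size_t n,
    size_t *used)
{
	size_t i;
	int error = -1;

	for (i = 0; i < n; i++) {
		error = configs[i]();
		if (error == 0) {
			*used = i;
			return (0);
		}
	}
	return (error);
}

/* Sample dumpers: an unavailable netdump and a working disk dump. */
static int
model_netdump_down(void)
{
	return (5);
}

static int
model_diskdump_ok(void)
{
	return (0);
}
```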


# e5ac3049 26-Jan-2019 Konstantin Belousov <kib@FreeBSD.org>

Bump SPECNAMELEN to MAXNAMLEN.

This includes the bump for cdevsw d_version. Otherwise, the impact on
the ABI (not KBI) is surprisingly low. The most important affected
interface is devname(3) and ttyname(3) which already correctly handle
long names (and ttyname(3) should not be affected at all).

Still, due to the d_version bump, I argue that the change is not MFC-able.

Requested by: mmacy
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18932


# bd92e6b6 05-May-2018 Mark Johnston <markj@FreeBSD.org>

Refactor some of the MI kernel dump code in preparation for netdump.

- Add clear_dumper() to complement set_dumper().
- Drain netdump's preallocated mbuf pool when clearing the dumper.
- Don't do bounds checking for dumpers with mediasize 0.
- Add dumper callbacks for initialization and for writing out headers.

Reviewed by: sbruno
MFC after: 1 month
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D15252


# 78f57a9c 08-Jan-2018 Mark Johnston <markj@FreeBSD.org>

Generalize the gzio API.

We currently use a set of subroutines in kern_gzio.c to perform
compression of user and kernel core dumps. In the interest of adding
support for other compression algorithms (zstd) in this role without
complicating the API consumers, add a simple compressor API which can be
used to select an algorithm.

Also change the (non-default) GZIO kernel option to not enable
compressed user cores by default. It's not clear that such a default
would be desirable with support for multiple algorithms implemented,
and it's inconsistent in that it isn't applied to kernel dumps.

Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D13632


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over the FreeBSD tree was useful as a
starting point.


# 64a16434 24-Oct-2017 Mark Johnston <markj@FreeBSD.org>

Add support for compressed kernel dumps.

When using a kernel built with the GZIO config option, dumpon -z can be
used to configure gzip compression using the in-kernel copy of zlib.
This is useful on systems with large amounts of RAM, which require a
correspondingly large dump device. Recovery of compressed dumps is also
faster since fewer bytes need to be copied from the dump device.

Because we have no way of knowing the final size of a compressed dump
until it is written, the kernel will always attempt to dump when
compression is configured, regardless of the dump device size. If the
dump is aborted because we run out of space, an error is reported on
the console.

savecore(8) is modified to handle compressed dumps and save them to
vmcore.<index>.gz, as it does when given the -z option.

A new rc.conf variable, dumpon_flags, is added. Its value is added to
the boot-time dumpon(8) invocation that occurs when a dump device is
configured in rc.conf.

Reviewed by: cem (earlier version)
Discussed with: def, rgrimes
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11723


# 46fcd1af 18-Oct-2017 Mark Johnston <markj@FreeBSD.org>

Move kernel dump offset tracking into MI code.

All of the kernel dump implementations keep track of the current offset
("dumplo") within the dump device. However, except for textdumps, they
all write the dump sequentially, so we can reduce code duplication by
having the MI code keep track of the current offset. The new
dump_append() API can be used to write at the current offset.

This is needed to implement support for kernel dump compression in the
MI kernel dump code.

Also simplify dump_encrypted_write() somewhat: use dump_write() instead
of duplicating its bounds checks, and get rid of the redundant offset
tracking.

Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11722
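
The relationship between an offset-taking write and an append that tracks the offset centrally might be modelled as below. This is a userspace sketch with hypothetical names (struct model_dumperinfo, model_dump_write(), model_dump_append()); a byte array stands in for the dump device and -1 stands in for whatever error the kernel returns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the dumper state. */
struct model_dumperinfo {
	unsigned char	*media;		/* stands in for the dump device */
	size_t		 mediasize;
	size_t		 dumpoff;	/* current offset, owned by shared code */
};

/* Write at an explicit offset, with bounds checking. */
static int
model_dump_write(struct model_dumperinfo *di, const void *buf, size_t off,
    size_t len)
{
	if (off > di->mediasize || len > di->mediasize - off)
		return (-1);
	memcpy(di->media + off, buf, len);
	return (0);
}

/* Write at the current offset and advance it only on success. */
static int
model_dump_append(struct model_dumperinfo *di, const void *buf, size_t len)
{
	int error;

	error = model_dump_write(di, buf, di->dumpoff, len);
	if (error == 0)
		di->dumpoff += len;
	return (error);
}
```

Sequential writers only ever call the append function, so none of them needs its own copy of the offset.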


# 01938d36 17-Aug-2017 Mark Johnston <markj@FreeBSD.org>

Rename mkdumpheader() and group EKCD functions in kern_shutdown.c.

This helps simplify the code in kern_shutdown.c and reduces the number
of globally visible functions.

No functional change intended.

Reviewed by: cem, def
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11603


# 50ef60da 17-Aug-2017 Mark Johnston <markj@FreeBSD.org>

Factor out duplicated kernel dump code into dump_{start,finish}().

dump_start() and dump_finish() are responsible for writing kernel dump
headers, optionally writing the key when encryption is enabled, and
initializing the initial offset into the dump device.

Also remove the unused dump_pad(), and make some functions static now that
they're only called from kern_shutdown.c.

No functional change intended.

Reviewed by: cem, def
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11584


# 8c1d0d9c 21-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Set default uid/gid to nobody/nogroup for NFSv4 mapping.

The default uid/gid for NFSv4 are set by the nfsuserd(8) daemon.
However, they were 0 until nfsuserd(8) was run. Since it is
possible to use NFSv4 without running the nfsuserd(8) daemon, set them
to nobody/nogroup initially.
Without this patch, the values would be set by the nfsuserd(8) daemon
and left changed even if the nfsuserd(8) daemon was killed. The default
values of 0 meant that setting a group to "wheel" would fail even when
done by root.
It also adds a definition of GID_NOGROUP to sys/conf.h.

Discussed on: freebsd-current@
MFC after: 2 weeks


# fbbd9655 28-Feb-2017 Warner Losh <imp@FreeBSD.org>

Renumber copyright clause 4

Renumber clause 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistence on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96


# 6ca28feb 31-Dec-2016 Konstantin Belousov <kib@FreeBSD.org>

Remove unused declaration.

The setconf() implementation was removed by r52778 Nov 1 1999.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 480f31c2 10-Dec-2016 Konrad Witaszczyk <def@FreeBSD.org>

Add support for encrypted kernel crash dumps.

Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.

A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.

dumpon(8) generates a one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported
but EKCD was designed to implement support for other algorithms in the future.
The public key is chosen using the -k flag. The dumpon rc(8) script can do this
automatically during startup using the dumppubkey rc.conf(5) variable. Once the
keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O
control.

When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of the previous value. This is intended to make a possible
differential cryptanalysis harder since it is possible to write multiple crash
dumps without reboot by repeating the following commands:
    # sysctl debug.kdb.enter=1
    db> call doadump(0)
    db> continue
    # savecore

A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to a block size.
The header structure has 512 bytes to match the block size so it was required to
make a panic string 4 bytes shorter to add a new field to the header structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.

Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed which makes it impossible
to use the CBC mode for encryption. It should be also noted that textdumps don't
contain sensitive data by design as a user decides what information should be
dumped.

savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.

decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.

A description of how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.

EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64 as it wasn't possible to run
FreeBSD due to the problems with QEMU emulation and lack of hardware.

Designed by: def, pjd
Reviewed by: cem, oshogbo, pjd
Partial review: delphij, emaste, jhb, kib
Approved by: pjd (mentor)
Differential Revision: https://reviews.freebsd.org/D4712


# e1da986b 01-May-2016 Konstantin Belousov <kib@FreeBSD.org>

Make it explicit that D_MEM cdevsw d_flag is to signify that the
driver is (or behaves identically to) /dev/mem. Remove the D_MEM flag
from random drivers.

Note that currently the D_MEM flag does not affect any behaviour, but
this is going to change in the next commit.

Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
X-Differential revision: https://reviews.freebsd.org/D6149


# 5dc5dab6 15-Apr-2016 Conrad Meyer <cem@FreeBSD.org>

Add 4Kn kernel dump support

(And 4Kn minidump support, but only for amd64.)

Make sure all I/O to the dump device is of the native sector size. To
that end, we keep a native sector sized buffer associated with dump
devices (di->blockbuf) and use it to pad smaller objects as needed (e.g.
kerneldumpheader).

Add dump_write_pad() as a convenience API to dump smaller objects with
zero padding. (Rather than pull in NPM leftpad, we wrote our own.)

savecore(8) has been updated to deal with these dumps. The format for
512-byte sector dumps should remain backwards compatible.

Minidumps for other architectures are left as an exercise for the
reader.

PR: 194279
Submitted by: ambrisko@
Reviewed by: cem (earlier version), rpokala
Tested by: rpokala (4Kn/512 except 512 fulldump), cem (512 fulldump)
Relnotes: yes
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D5848
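
Zero padding a small object up to the native sector size can be sketched as follows. model_pad_to_sector() is a hypothetical userspace helper, not the kernel's dump_write_pad(); it only shows the copy-then-zero step on a sector-sized buffer such as di->blockbuf.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes of obj into a sector-sized buffer and zero the tail.
 * Returns the number of bytes to write (one full sector), or 0 when the
 * object does not fit into a single sector.
 */
static size_t
model_pad_to_sector(unsigned char *blockbuf, size_t blocksize,
    const void *obj, size_t len)
{
	if (len > blocksize)
		return (0);
	memcpy(blockbuf, obj, len);
	memset(blockbuf + len, 0, blocksize - len);
	return (blocksize);
}
```

On a 4Kn device a 512-byte header would therefore go out as one 4096-byte write whose last 3584 bytes are zero.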


# a53b7c69 13-Jan-2016 Konstantin Belousov <kib@FreeBSD.org>

Make devfs_fpdrop() static. It was not a public KPI, and it has no
reason to remain exported for some time.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 48ce5d4c 07-Jan-2016 Konstantin Belousov <kib@FreeBSD.org>

Provide yet another KPI for cdev creation, make_dev_s(9).

Immediate problem fixed by the new KPI is the long-standing race
between device creation and assignments to cdev->si_drv1 and
cdev->si_drv2, which allows the window where cdevsw methods might be
called with si_drv1,2 fields not yet set. Devices typically checked
for NULL and returned spurious errors to usermode, and often left some
methods unchecked.

The new function interface is designed to be extensible, which should
allow adding more features to make_dev_s(9) without inventing yet
another name for a function to create devices, while maintaining KPI and
even KBI backward-compatibility.

Reviewed by: hps, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D4746
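
The race and its fix can be modelled in userspace without any kernel interfaces. The structures and functions below are hypothetical stand-ins (the field name mda_si_drv1 is only modelled on the argument structure described in make_dev_s(9)), and the "published" flag represents the point at which cdevsw methods may start running.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cdev. */
struct model_cdev {
	void	*si_drv1;
	int	 published;	/* nonzero once methods may be called */
};

/* Hypothetical argument block carrying the driver data up front. */
struct model_dev_args {
	void	*mda_si_drv1;
};

/* Old pattern: the device is published before the caller sets si_drv1. */
static void
model_make_dev_old(struct model_cdev *dev)
{
	dev->si_drv1 = NULL;
	dev->published = 1;	/* window: methods can see si_drv1 == NULL */
}

/* New pattern: driver data is assigned before the device is published. */
static void
model_make_dev_s(const struct model_dev_args *args, struct model_cdev *dev)
{
	dev->si_drv1 = args->mda_si_drv1;
	dev->published = 1;
}
```

Passing the driver data through an argument structure also leaves room for new fields later without changing the function's signature, which matches the extensibility goal stated above.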


# 8d7e0f58 02-Dec-2015 John Baldwin <jhb@FreeBSD.org>

The cdevpriv_dtr_t typedef was not able to be used in a function prototype
like the various d_*_t typedefs since it declared a function pointer rather
than a function. Add a new d_priv_dtor_t typedef that declares the function
and can be used as a function prototype. The previous typedef wasn't
useful outside of the cdevpriv implementation, so retire it.

The name d_priv_dtor_t was chosen to be more consistent with cdev methods
since it is commonly used in place of d_close_t even though it is not a
direct pointer in struct cdevsw.

Reviewed by: kib, imp
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D4340
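
The distinction between the two kinds of typedef is plain C and can be shown outside the kernel. The names below (model_dtr_ptr_t, model_dtor_t, model_softc_dtor) are hypothetical; only the pattern mirrors the change described above.

```c
#include <stddef.h>

/* Function-pointer typedef: declares pointer objects, not functions. */
typedef void (*model_dtr_ptr_t)(void *data);

/* Function typedef: usable as a prototype, like the d_*_t types. */
typedef void model_dtor_t(void *data);

/* Prototype written with the function typedef. */
static model_dtor_t model_softc_dtor;

static int model_dtor_calls;

/* The definition still has to spell out the full signature. */
static void
model_softc_dtor(void *data)
{
	(void)data;
	model_dtor_calls++;
}

/* A pointer to the function type is available where one is needed. */
static model_dtor_t *const model_registered_dtor = model_softc_dtor;
```

Declaring the prototype through the typedef lets the compiler reject a destructor whose signature drifts from the expected one.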


# ad8d57a9 10-Sep-2015 Warner Losh <imp@FreeBSD.org>

dev_strategy and dev_strategy_csw are unused since r281825. Remove
them.

Differential Revision: https://reviews.freebsd.org/D3620


# d8f4b935 11-Aug-2015 Koop Mast <kwm@FreeBSD.org>

Instead of defining the actual user and group IDs in the drmP.h files,
define GID_VIDEO in sys/conf.h, and use it together with UID_ROOT
to define DRM_DEV_UID and DRM_DEV_GID in the drmP.h files.

So there is one place where the UIDs and GIDs are defined.

Submitted by: ed@
Reviewed by: ed@, dumbbell@
Differential Revision: https://reviews.freebsd.org/D3360


# 304d0201 29-Jan-2015 John Baldwin <jhb@FreeBSD.org>

Remove the d_thread_t compatibility shim provided in 5.0 to handle the
struct proc (<= 4.x) vs struct thread (>= 5.0) argument to cdevsw routines.
It is long past its expiration date.

PR: 196544 (exp-run)


# 07dbde67 14-Jan-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Add a kernel function to delist our kernel character devices, so that
the device name can be re-used right away in case we are destroying
the character devices in the background.

MFC after: 4 days
Reported by: dchagin@


# bdb9ab0d 06-Jan-2015 Mark Johnston <markj@FreeBSD.org>

Factor out duplicated code from dumpsys() on each architecture into generic
code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly
identical and simply redefine a number of constants and helper subroutines;
a generic implementation will make it easier to implement features around
kernel core dumps. This change does not alter any minidump code and should
have no functional impact.

PR: 193873
Differential Revision: https://reviews.freebsd.org/D904
Submitted by: Conrad Meyer <conrad.meyer@isilon.com>
Reviewed by: jhibbits (earlier version)
Sponsored by: EMC / Isilon Storage Division


# 5ebb15b9 10-Nov-2014 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Add missing privilege check when setting the dump device. Before that change it
was possible for a regular user to set up the dump device if he had write access
to the given device. In theory it is a security issue, as a user might get access
to the kernel's memory after provoking a kernel crash, but in practice it is not
recommended to give regular users direct access to storage devices.

Rework the code so that we do the privilege check within the set_dumper() function
to avoid similar problems in the future.

Discussed with: secteam


# 4d1a2fee 07-Nov-2014 Konstantin Belousov <kib@FreeBSD.org>

Add DEV_MODULE_ORDERED().

Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# f6b4f5ca 25-Jul-2014 Gavin Atkinson <gavin@FreeBSD.org>

Add error return to dumpsys(), and use it in doadump().

This commit does not add error returns to minidumpsys() or
textdump_dumpsys(); those can also be added later.

Submitted by: Conrad Meyer (EMC / Isilon storage division)


# 93729c17 23-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Add support to physio(9) for devices that don't want I/O split and
configure sa(4) to request no I/O splitting by default.

For tape devices, the user needs to be able to clearly understand
what blocksize is actually being used when writing to a tape
device. The previous behavior of physio(9) was that it would split
up any I/O that was too large for the device, or too large to fit
into MAXPHYS. This means that if, for instance, the user wrote a
1MB block to a tape device, and MAXPHYS was 128KB, the 1MB write
would be split into 8 128K chunks. This would be done without
informing the user.

This has suboptimal effects, especially when trying to communicate
status to the user. In the event of an error writing to a tape
(e.g. physical end of tape) in the middle of a 1MB block that has
been split into 8 pieces, the user could have the first two 128K
pieces written successfully, the third returned with an error, and
the last 5 returned with 0 bytes written. If the user is using
a standard write(2) system call, all he will see is the ENOSPC
error. He won't have a clue how much actually got written. (With
a writev(2) system call, he should be able to determine how much
got written in addition to the error.)

The solution is to prevent physio(9) from splitting the I/O. The
new cdev flag, SI_NOSPLIT, tells physio that the driver does not
want I/O to be split beforehand.
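
The two behaviours can be compared with a small userspace sketch. The helper names are hypothetical, only the size limit is modelled (the segment-count and alignment conditions are left out), and EFBIG is the error this commit message names for rejected requests.

```c
#include <errno.h>
#include <stddef.h>

/* Number of chunks a request is split into when splitting is allowed. */
static size_t
model_split_count(size_t len, size_t maxio)
{
	return ((len + maxio - 1) / maxio);
}

/* With no-split set, an oversized request is rejected instead of split. */
static int
model_physio_check(size_t len, size_t maxio, int nosplit)
{
	if (nosplit && len > maxio)
		return (EFBIG);
	return (0);
}
```

For the example in the text, a 1MB write against a 128KB limit is 8 chunks when splitting is allowed and a single EFBIG error when it is not.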

Although the sa(4) driver now enables SI_NOSPLIT by default,
that can be disabled by two loader tunables for now. It will not
be configurable starting in FreeBSD 11.0. kern.cam.sa.allow_io_split
allows the user to configure I/O splitting for all sa(4) driver
instances. kern.cam.sa.%d.allow_io_split allows the user to
configure I/O splitting for a specific sa(4) instance.

There are also now three sa(4) driver sysctl variables that let the
users see some sa(4) driver values. kern.cam.sa.%d.allow_io_split
shows whether I/O splitting is turned on. kern.cam.sa.%d.maxio shows
the maximum I/O size allowed by kernel configuration parameters
(e.g. MAXPHYS, DFLTPHYS) and the capabilities of the controller.
kern.cam.sa.%d.cpi_maxio shows the maximum I/O size supported by
the controller.

Note that a better long term solution would be to implement support
for chaining buffers, so that MAXPHYS is no longer a limiting
factor for I/O size to tape and disk devices. At that point, the
controller and the tape drive would become the limiting factors.

sys/conf.h: Add a new cdev flag, SI_NOSPLIT, that allows a
driver to tell physio not to split up I/O.

sys/param.h: Bump __FreeBSD_version to 1000049 for the addition
of the SI_NOSPLIT cdev flag.

kern_physio.c: If the SI_NOSPLIT flag is set on the cdev, fail with
EFBIG (File too large) any I/O that is larger than
si_iosize_max or MAXPHYS, has more than one segment,
or would have to be split because of misalignment.

In the event of an error, print a console message to
give the user a clue about what happened.

scsi_sa.c: Set the SI_NOSPLIT cdev flag on the devices created
for the sa(4) driver by default.

Add tunables to control whether we allow I/O splitting
in physio(9).

Explain in the comments that allowing I/O splitting
will be deprecated for the sa(4) driver in FreeBSD
11.0.

Add sysctl variables to display the maximum I/O
size we can do (which could be further limited by
read block limits) and the maximum I/O size that
the controller can do.

Limit our maximum I/O size (recorded in the cdev's
si_iosize_max) by MAXPHYS. This isn't strictly
necessary, because physio(9) will limit it to
MAXPHYS, but it will provide some clarity for the
application.

Record the controller's maximum I/O size reported
in the Path Inquiry CCB.

sa.4: Document the block size behavior, and explain that
the option of allowing physio(9) to split the I/O
will disappear in FreeBSD 11.0.

Sponsored by: Spectra Logic


# ce625ec7 15-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Change the way that unmapped I/O capability is advertised.

The previous method was to set the D_UNMAPPED_IO flag in the cdevsw
for the driver. The problem with this is that in many cases (e.g.
sa(4)) there may be some instances of the driver that can handle
unmapped I/O and some that can't. The isp(4) driver can handle
unmapped I/O, but the esp(4) driver currently cannot. The cdevsw
is shared among all driver instances.

So instead of setting a flag on the cdevsw, set a flag on the cdev.
This allows drivers to indicate support for unmapped I/O on a
per-instance basis.
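
The difference between a per-driver and a per-instance flag can be illustrated with a small userland model. All names here (struct demo_cdevsw, struct demo_cdev, DEMO_SI_UNMAPPED, demo_can_unmapped()) are hypothetical stand-ins and the bit value is arbitrary; the real structures in sys/conf.h carry many more fields.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SI_UNMAPPED	0x1	/* hypothetical bit value */

struct demo_cdevsw {		/* one per driver, shared by all instances */
	const char *d_name;
};

struct demo_cdev {		/* one per device instance */
	const struct demo_cdevsw *si_devsw;
	unsigned int si_flags;
};

/* The capability is looked up on the instance, not on the shared switch. */
static int
demo_can_unmapped(const struct demo_cdev *dev)
{
	assert(dev != NULL);
	return ((dev->si_flags & DEMO_SI_UNMAPPED) != 0);
}
```

Two instances can point at the same demo_cdevsw while only one of them advertises the capability, which is what a flag stored in the shared switch could not express.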

sys/conf.h: Remove the D_UNMAPPED_IO cdevsw flag and replace it
with an SI_UNMAPPED cdev flag.

kern_physio.c: Look at the cdev SI_UNMAPPED flag to determine
whether or not a particular driver can handle
unmapped I/O.

geom_dev.c: Set the SI_UNMAPPED flag for all GEOM cdevs.
Since GEOM will create a temporary mapping when
needed, setting SI_UNMAPPED unconditionally will
work.

Remove the D_UNMAPPED_IO flag.

nvme_ns.c: Set the SI_UNMAPPED flag on cdevs created here
if NVME_UNMAPPED_BIO_SUPPORT is enabled.

vfs_aio.c: In aio_qphysio(), check the SI_UNMAPPED flag on a
cdev instead of the D_UNMAPPED_IO flag on the cdevsw.

sys/param.h: Bump __FreeBSD_version to 1000045 for the switch from
setting the D_UNMAPPED_IO flag in the cdevsw to setting
SI_UNMAPPED in the cdev.

Reviewed by: kib, jimharris
MFC after: 1 week
Sponsored by: Spectra Logic


# d1e99f43 27-Mar-2013 Konstantin Belousov <kib@FreeBSD.org>

Add dev_strategy_csw() function, which is similar to dev_strategy()
but assumes that a thread reference was already obtained on the passed
device. Use the function from physio(), to avoid two extra dev_mtx
lock and unlock. Note that physio() is always used as the cdevsw
method, or is called from a cdevsw method, and the caller already owns
the reference.

dev_strategy() is left to keep KPI intact, but now it is implemented
as a wrapper around dev_strategy_csw().

Do some style cleanup in physio().

Requested and reviewed by: kan (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 31932fae 25-Mar-2013 Alexander Kabaev <kan@FreeBSD.org>

Do not pass unmapped buffers to drivers that cannot handle them

In physio, check if device can handle unmapped IO and pass an
appropriately mapped buffer to the driver strategy routine. The
only driver in the tree that can handle unmapped buffers is one
exposed by GEOM, so mark it as such with the new flag in the
driver cdevsw structure.

This fixes insta-panics on hosts, running dconschat, as /dev/fwmem
is an example of the driver that makes use of physio routine, but
bypasses the g_down thread, where the buffer gets mapped normally.

Discussed with: kib (earlier version)


# bad7e7f3 01-Nov-2012 Alfred Perlstein <alfred@FreeBSD.org>

Provide a device name in the sysctl tree for programs to query the
state of crashdump target devices.

This will be used to add a "-l" (ell) flag to dumpon(8) to list the
currently configured dumpdev.

Reviewed by: phk


# fa4dd278 28-Aug-2012 Ed Schouten <ed@FreeBSD.org>

Remove unused SI_* flags.

The SI_DEVOPEN, SI_CONSOPEN and SI_CANDELETE flags are not used by any
piece of code in the tree.


# 1faacf5d 28-Mar-2012 Kirk McKusick <mckusick@FreeBSD.org>

Keep track of the mount point associated with a special device
to enable the collection of counts of synchronous and asynchronous
reads and writes for its associated filesystem. The counts are
displayed using `mount -v'.

Ensure that buffers used for paging indicate the vnode from
which they are operating so that counts of paging I/O operations
from the filesystem are collected.

This checkin only adds the setting of the mount point for the
UFS/FFS filesystem, but it would be trivial to add the setting
and clearing of the mount point at filesystem mount/unmount
time for other filesystems too.

Reviewed by: kib


# 8fac9b7b 09-Feb-2012 Ed Schouten <ed@FreeBSD.org>

Merge si_name and __si_namebuf.

The si_name pointer always points to the __si_namebuf member inside the
same object. Remove it and rename __si_namebuf to si_name.


# a185bd12 18-Oct-2011 Ed Schouten <ed@FreeBSD.org>

Get rid of D_PSEUDO.

It seems the D_PSEUDO flag was meant to allow make_dev() to return NULL.
Nowadays we have a different interface for that; make_dev_p(). There's
no need to keep it there.

While there, remove an unneeded D_NEEDMINOR from the gpio driver.

Discussed with: gonzo@ (gpio)


# a906bdb0 17-Oct-2011 Ed Schouten <ed@FreeBSD.org>

Fix whitespace.


# 084e62e9 05-Oct-2011 Konstantin Belousov <kib@FreeBSD.org>

Export devfs inode number allocator for the kernel consumers.

Reviewed by: jhb
MFC after: 2 weeks


# aa76615d 14-Jun-2011 Justin T. Gibbs <gibbs@FreeBSD.org>

sys/sys/conf.h:
sys/kern/kern_conf.c:
Add make_dev_physpath_alias(). This interface takes
the parent cdev of the alias, an old alias cdev (if any)
to replace with the newly created alias, and the physical
path string. The alias is visible as a symlink to the
parent, with the same name as the parent, rooted at
physpath in devfs.

Note: make_dev_physpath_alias() has hard coded knowledge of the
Solaris style prefix convention for physical path data,
"id1,". In the future, I expect the convention to change
to allow "physical path quality" to be reported in the
prefix. For example, a physical path based on NewBus
topology would be of "lower quality" than a physical path
reported by a device enclosure.

Sponsored by: Spectra Logic Corporation


# 299cceef 06-Jun-2011 Marcel Moolenaar <marcel@FreeBSD.org>

Fix making kernel dumps from the debugger by creating a command
for it. Do not expect a developer to call doadump(). Calling
doadump does not necessarily work when it's declared static. Nor
does it necessarily do what was intended in the context of text
dumps. The dump command always creates a core dump.

Move printing of error messages from doadump to the dump command,
now that we don't have to worry about being called from DDB.


# b50a7799 03-May-2011 Andrey V. Elsukov <ae@FreeBSD.org>

Add make_dev_alias_p() function. It is similar to make_dev_alias(),
but it may return an error like make_dev_p() does.

Reviewed by: kib (previous version)
MFC after: 2 weeks


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 68f7a013 07-Oct-2010 Jaakko Heinonen <jh@FreeBSD.org>

Check the device name validity on device registration.

A new function prep_devname() sanitizes a device name by removing
leading and redundant sequential slashes. The function returns an error
for names which already exist or are considered invalid.
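
The slash handling described above can be sketched in userland C as follows. This is an assumption-laden illustration, not the kernel's prep_devname(): demo_sanitize_devname() is a hypothetical helper that covers only the removal of leading and repeated slashes, and it performs none of the validity or duplicate-name checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Strip leading slashes and collapse runs of slashes to a single
 * slash, in place. Returns the length of the resulting string.
 */
static size_t
demo_sanitize_devname(char *name)
{
	const char *src;
	char *dst;

	assert(name != NULL);
	src = name;
	dst = name;
	while (*src == '/')		/* drop leading slashes */
		src++;
	while (*src != '\0') {
		*dst++ = *src;
		if (*src == '/') {
			while (*src == '/')	/* skip the rest of the run */
				src++;
		} else
			src++;
	}
	*dst = '\0';
	return ((size_t)(dst - name));
}
```

Because the output is never longer than the input, the rewrite can safely reuse the caller's buffer.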

A new flag MAKEDEV_CHECKNAME for make_dev_p(9) and make_dev_credf(9)
indicates that the caller is prepared to handle an error related to the
device name. An invalid name triggers a panic if the flag is not
specified.

Document the MAKEDEV_CHECKNAME flag in the make_dev(9) manual page.

Idea from: kib
Reviewed by: kib


# 3979450b 06-Aug-2010 Konstantin Belousov <kib@FreeBSD.org>

Add new make_dev_p(9) flag MAKEDEV_ETERNAL to inform devfs that created
cdev will never be destroyed. Propagate the flag to devfs vnodes as
VV_ETERNALDEV. Use the flags to avoid acquiring devmtx and taking a
thread reference on such nodes.

In collaboration with: pho
MFC after: 1 month


# 2e983ace 17-Jun-2010 Ed Schouten <ed@FreeBSD.org>

Remove the unit argument from the recently added make_dev_p().

New code that creates character devices shouldn't use device unit
numbers, but only si_drv[12] to hold pointer to per-device data. Make
this function more future proof by removing the unit number argument.

Discussed with: kib


# f1bb758d 12-Jun-2010 Konstantin Belousov <kib@FreeBSD.org>

Add another variation of make_dev(9), make_dev_p(9), that is allowed
to fail and can return useful error code.

Requested by: jh
Reviewed by: imp, jh
MFC after: 3 weeks


# c67c645d 20-May-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r207729:
Add MAKEDEV_NOWAIT flag for make_dev_credf(9).


# d2ba618a 06-May-2010 Konstantin Belousov <kib@FreeBSD.org>

Add MAKEDEV_NOWAIT flag to make_dev_credf(9), to create a device node
in a no-sleep context. If resource allocation cannot be done without
sleep, make_dev_credf() fails and returns NULL.

Reviewed by: jh
MFC after: 2 weeks


# cfd7bace 29-Dec-2009 Robert Noland <rnoland@FreeBSD.org>

Update d_mmap() to accept vm_ooffset_t and vm_memattr_t.

This replaces d_mmap() with the d_mmap2() implementation and also
changes the type of offset to vm_ooffset_t.

Purge d_mmap2().

All driver modules will need to be rebuilt since D_VERSION is also
bumped.

Reviewed by: jhb@
MFC after: Not in this lifetime...


# 2d2a89dd 31-Oct-2009 Ed Schouten <ed@FreeBSD.org>

Turn unused structure fields of cdevsw into spares.

d_uid, d_gid and d_mode are unused, because permissions are stored in
cdevpriv nowadays. d_kind doesn't seem to be used at all. We no longer
keep a list of cdevsw's, so d_list is also unused.

uid_t and gid_t are 32 bits, but mode_t is 16 bits. Because of alignment
constraints of d_kind, we can safely turn them into three 32-bit integers.
d_kind and d_list are equal in size to three pointers.

Discussed with: kib


# a6fb7268 29-Oct-2009 John Baldwin <jhb@FreeBSD.org>

MFC 196615:
Extend the device pager to support different memory attributes on different
pages in an object.
- Add a new variant of d_mmap() currently called d_mmap2() which accepts
an additional in/out parameter that is the memory attribute to use for
the requested page.
- A driver either uses d_mmap() or d_mmap2() for all requests but not both.
The current implementation uses a flag in the cdevsw (D_MMAP2) to indicate
that the driver provides a d_mmap2() handler instead of d_mmap(). This
is done to make the change ABI compatible with existing drivers and
MFC'able to 7 and 8.


# 2fa8c8d2 28-Aug-2009 John Baldwin <jhb@FreeBSD.org>

Extend the device pager to support different memory attributes on different
pages in an object.
- Add a new variant of d_mmap() currently called d_mmap2() which accepts
an additional in/out parameter that is the memory attribute to use for
the requested page.
- A driver either uses d_mmap() or d_mmap2() for all requests but not both.
The current implementation uses a flag in the cdevsw (D_MMAP2) to indicate
that the driver provides a d_mmap2() handler instead of d_mmap(). This
is done to make the change ABI compatible with existing drivers and
MFC'able to 7 and 8.

Submitted by: alc
MFC after: 1 month


# 2654ae4d 25-Jun-2009 John Baldwin <jhb@FreeBSD.org>

Remove the d_spare2_t typedef. The d_spare2 field was replaced by
d_mmap_single(). I considered adding a new round of padding for 8.0.
However, since cdevsw already maintains a version field, new versions
can be handled without requiring the need for explicit padding fields.


# 64345f0b 01-Jun-2009 John Baldwin <jhb@FreeBSD.org>

Add an extension to the character device interface that allows character
device drivers to use arbitrary VM objects to satisfy individual mmap()
requests.
- A new d_mmap_single(cdev, &foff, objsize, &object, prot) callback is
added to cdevsw. This function is called for each mmap() request.
If it returns ENODEV, then the mmap() request will fall back to using
the device's device pager object and d_mmap(). Otherwise, the method
can return a VM object to satisfy this entire mmap() request via
*object. It can also modify the starting offset into this object via
*foff. This allows device drivers to use the file offset as a cookie
to identify specific VM objects.
- vm_mmap_vnode() has been changed to call vm_mmap_cdev() directly when
mapping VCHR vnodes. This avoids duplicating all the cdev mmap
handling code and simplifies some of vm_mmap_vnode().
- D_VERSION has been bumped to D_VERSION_02. Older device drivers
using D_VERSION_01 are still supported.

MFC after: 1 month


# c3661cd0 15-Feb-2009 Warner Losh <imp@FreeBSD.org>

Make dumper_t definition conform more closely to style(9). This also
avoids keywords in other languages that may have been present before.

Submitted by: Andriy Gapon, jkoshy@


# 3231dedb 03-Feb-2009 Ed Schouten <ed@FreeBSD.org>

Remove NUMCDEVSW, which is unused since RELENG_5.

Discussed with: kib


# a4611ab6 28-Jan-2009 Ed Schouten <ed@FreeBSD.org>

Last step of splitting up minor and unit numbers: remove minor().

Inside the kernel, the minor() function was responsible for obtaining
the device minor number of a character device. Because we made device
numbers dynamically allocated and independent of the unit number passed
to make_dev() a long time ago, it was actually a misnomer. If you really
want to obtain the device number, you should use dev2udev().

We already converted all the drivers to use dev2unit() to obtain the
device unit number, which is still used by a lot of drivers. I've
noticed not a single driver passes NULL to dev2unit(). Even if they
would, its behaviour would make little sense. This is why I've removed
the NULL check.

This commit removes minor(), minor2unit() and unit2minor() from the
kernel. Because there was a naming collision with uminor(), we can
rename umajor() and uminor() back to major() and minor(). This means
that the makedev(3) manual page also applies to kernel space code now.

I suspect umajor() and uminor() aren't used that often in external code,
but to make it easier for other parties to port their code, I've
increased __FreeBSD_version to 800062.


# 1ba4a712 17-Nov-2008 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.

This brings a huge amount of changes; I'll enumerate only the user-visible ones:

- Delegated Administration

Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.

- L2ARC

Level 2 cache for ZFS - allows additional disks to be used for cache.
Huge performance improvements mostly for random read of mostly
static content.

- slog

Allows additional disks to be used for the ZFS Intent Log to speed up
operations like fsync(2).

- vfs.zfs.super_owner

Allows regular users to perform privileged operations on files stored
on ZFS file systems owned by them. Be very careful with this one.

- chflags(2)

Not all the flags are supported. This still needs work.

- ZFSBoot

Support to boot off of ZFS pool. Not finished, AFAIK.

Submitted by: dfr

- Snapshot properties

- New failure modes

Previously, if a write request failed, the system panicked. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests

- Refquota, refreservation properties

Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.

- Sparse volumes

ZVOLs that don't reserve space in the pool.

- External attributes

Compatible with extattr(2).

- NFSv4-ACLs

Not sure about the status, might not be complete yet.

Submitted by: trasz

- Creation-time properties

- Regression tests for zpool(8) command.

Obtained from: OpenSolaris


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# e1e93f9f 01-Oct-2008 Ed Schouten <ed@FreeBSD.org>

Remove function prototypes of nonexistent TTY functions.

It turns out I overlooked some function prototypes that were actually
TTY related, but were stored in <sys/conf.h> to implement the D_TTY
flag. Remove these prototypes now that they don't exist anymore.


# edde8745 26-Sep-2008 Ed Schouten <ed@FreeBSD.org>

Rename the `minor' argument of make_dev(9) to `unit'.

To prevent any further confusion about device minor and unit numbers,
we'd better just refer to device unit numbers. Many people still think
the numbers we show inside devfs have any relation to the numbers passed
to make_dev(9), which is not the case.

Discussed with: kib


# bc093719 20-Aug-2008 Ed Schouten <ed@FreeBSD.org>

Integrate the new MPSAFE TTY layer to the FreeBSD operating system.

The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

The old TTY layer has a driver model that is not abstract enough to
make it friendly to use. A good example is the output path, where the
device drivers directly access the output buffers. This means that an
in-kernel PPP implementation must always convert network buffers into
TTY buffers.

If a PPP implementation would be built on top of the new TTY layer
(still needs a hooks layer, though), it would allow the PPP
implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

With the old TTY layer, it isn't entirely safe to destroy TTY's from
the system. This implementation has a two-step destructing design,
where the driver first abandons the TTY. After all threads have left
the TTY, the TTY layer calls a routine in the driver, which can be
used to free resources (unit numbers, etc).

The pts(4) driver also implements this feature, which means
posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

One of the major improvements is the per-TTY mutex, which is expected
to improve scalability when compared to the old Giant locking.
Another change is the unbuffered copying to userspace, which is both
used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from: //depot/projects/mpsafetty/...
Approved by: philip (ex-mentor)
Discussed: on the lists, at BSDCan, at the DevSummit
Sponsored by: Snow B.V., the Netherlands
dcons(4) fixed by: kan


# 05427aaf 16-Jun-2008 Konstantin Belousov <kib@FreeBSD.org>

Struct cdev is always the member of the struct cdev_priv. When devfs
needed to promote cdev to cdev_priv, the si_priv pointer was followed.

Use member2struct() to calculate address of the wrapping cdev_priv.
Rename si_priv to __si_reserved.
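
The pointer arithmetic behind this change can be sketched in userland C. The structures below are cut-down, hypothetical stand-ins (struct demo_cdev, struct demo_cdev_priv), and DEMO_MEMBER2STRUCT is written here from offsetof() as an assumption about what such a macro does; it is not copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

struct demo_cdev {			/* the "public" part */
	int si_flags;
};

struct demo_cdev_priv {			/* the wrapping structure */
	int cdp_inode;			/* private bookkeeping */
	struct demo_cdev cdp_c;		/* embedded public part */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define DEMO_MEMBER2STRUCT(s, m, x)					\
	((struct s *)(void *)((char *)(x) - offsetof(struct s, m)))

static struct demo_cdev_priv *
demo_cdev2priv(struct demo_cdev *dev)
{
	assert(dev != NULL);
	return (DEMO_MEMBER2STRUCT(demo_cdev_priv, cdp_c, dev));
}
```

Since the embedded member always lives at a fixed offset inside its wrapper, no back pointer needs to be stored, which is why the si_priv field could be retired.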

Tested by: pho
Reviewed by: ed
MFC after: 2 weeks


# 0f03ce1b 12-Jun-2008 Ed Schouten <ed@FreeBSD.org>

Turn dev2unit(), minor(), unit2minor() and minor2unit() into macro's.

Now that we got rid of the minor-to-unit conversion and the constraints
on device minor numbers, we can convert the functions that operate on
minor and unit numbers to simple macro's. The unit2minor() and
minor2unit() macro's are now no-ops.

The ZFS code also defined a macro named `minor'. Change the ZFS code to
use umajor() and uminor() here, as it is the correct approach to do
this. Also add $FreeBSD$ to keep SVN happy.

Approved by: philip (mentor), pjd


# 29d4cb24 11-Jun-2008 Ed Schouten <ed@FreeBSD.org>

Don't enforce unique device minor number policy anymore.

Except for the case where we use the cloner library (clone_create() and
friends), there is no reason to enforce a unique device minor number
policy. There are various drivers in the source tree that allocate unr
pools and such to provide minor numbers, without using them themselves.

Because we still need to support unique device minor numbers for the
cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's
that are used in combination with the cloner library should be marked
with this flag to make the cloning work.

This means drivers can now freely use si_drv0 to store their own flags
and state, making it effectively the same as si_drv1 and si_drv2. We
still keep the minor() and dev2unit() routines around to make drivers
happy.

The NTFS code also used the minor number in its hash table. We should
not do this anymore. If the si_drv0 field would be changed, it would no
longer end up in the same list.

Approved by: philip (mentor)


# 06d425f9 28-May-2008 Ed Schouten <ed@FreeBSD.org>

Remove the distinction between device minor and unit numbers.

Even though we got rid of device major numbers some time ago, device
drivers still need to provide unique device minor numbers to make_dev().
These numbers are only used inside the kernel. They are not related to
device major and minor numbers which are visible in devfs. These are
actually based on the inode number of the device.

It would eventually be nice to remove minor numbers entirely, but we
don't want to be too aggressive here.

Because the 8-15 bits of the device number field (si_drv0) are still
reserved for the major number, there is no 1:1 mapping of the device
minor and unit numbers. Because this is now unused, remove the
restrictions on these numbers.

The MAXMAJOR definition was actually used for two purposes. It was used
to convert both the userspace and kernelspace device numbers to their
major/minor pair, which is why it is now named UMINORMASK.

minor2unit() and unit2minor() have now become useless. Both minor() and
dev2unit() now serve the same purpose. We should eventually remove some
of them, at least turning them into macro's. If devfs would become
completely minor number unaware, we could consider using si_drv0 directly,
just like si_drv1 and si_drv2.

Approved by: philip (mentor)


# 55eff770 22-May-2008 Ed Schouten <ed@FreeBSD.org>

Add a new group definition to sys/conf.h: GID_TTY.

Our current TTY layer uses a set-uid application called ptchown to
change ownership of a PTY slave device. The new TTY layer implements
this functionality through a new ioctl().

By accident I discovered Darwin's TTY layer also uses this approach.
Because of this, they also have a GID_TTY.

Approved by: philip (mentor)


# 82f4d640 21-May-2008 Konstantin Belousov <kib@FreeBSD.org>

Implement the per-open file data for the cdev.

The patch does not change the cdevsw KBI. Management of the data is
provided by the functions
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
int devfs_get_cdevpriv(void **datap);
void devfs_clear_cdevpriv(void);
All of the functions are supposed to be called from the cdevsw method
contexts.

- devfs_set_cdevpriv assigns the priv as private data for the file
descriptor which is used to initiate currently performed driver
operation. dtr is the function that will be called when either the
last reference to the file goes away, the device is destroyed or
devfs_clear_cdevpriv is called.
- devfs_get_cdevpriv is the obvious accessor.
- devfs_clear_cdevpriv allows clearing the private data of a file that
is still open.

Implementation keeps the driver-supplied pointers in the struct
cdev_privdata, that is referenced both from the struct file and struct
cdev, and cannot outlive any of the referee.

Man pages will be provided after the KPI stabilizes.

Reviewed by: jhb
Useful suggestions from: jeff, antoine
Debugging help and tested by: pho
MFC after: 1 month


# aeeb4202 17-Mar-2008 Konstantin Belousov <kib@FreeBSD.org>

Fix two races in the handling of the d_gianttrick for the D_NEEDGIANT
drivers.

In the giant_XXX wrappers for the device methods of the D_NEEDGIANT
drivers, do not dereference the cdev->si_devsw. It is racing with
the destroy_devl() clearing of the si_devsw. Instead, use the
dev_refthread() and return ENXIO for the destroyed device. [1]

The check for the D_INIT in the prep_cdevsw() was not synchronized with
the call of the fini_cdevsw() in destroy_devl(), that under rapid device
creation/destruction may result in the use of uninitialized cdevsw [2].
Change the protocol for the prep_cdevsw(), requiring it to be called
under dev_mtx, where the check for D_INIT is done.

Do not free the memory allocated for the gianttrick cdevsw while holding
the dev_mtx, put it into the free list to be freed later. Reuse the
d_gianttrick pointer to keep the size and layout of the struct cdevsw
(requested by phk). Free the memory in the dev_unlock_and_free(), and do
all the free after the dev_mtx is dropped (suggested by jhb).

Reported by: bsdimp + many [1], pho [2]
Reviewed by: phk, jhb
Tested by: pho
MFC after: 1 week


# 7bbd40c5 14-Feb-2008 Scott Long <scottl@FreeBSD.org>

Teach the dump and minidump code to respect the maxiosize attribute of
the disk; the hard-coded assumption of 64K doesn't work in all cases.


# 007b1b7b 28-Jan-2008 Ruslan Ermilov <ru@FreeBSD.org>

Add a wrapper function that bound checks writes to the dump device.


# de10ffa5 03-Jul-2007 Konstantin Belousov <kib@FreeBSD.org>

Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls
destroy_dev() from d_close() cdev method would self-deadlock.
devfs_close() bumps the device thread reference counter, and destroy_dev()
sleeps, waiting for si_threadcount to reach zero for cdev without
d_purge method.

destroy_dev_sched() could be used instead from d_close(), to
schedule execution of destroy_dev() in another context. The
destroy_dev_sched_drain() function can be used to drain the scheduled
calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains
the clone events to make sure no lingering devices are left after the
dev_clone event handler is deregistered.

make_dev_credf(MAKEDEV_REF) function should be used from dev_clone
event handlers instead of make_dev()/make_dev_cred() to ensure that created
device has reference counter bumped before cdev mutex is dropped inside
make_dev().

Reviewed by: tegge (early versions), njl (programming interface)
Debugging help and testing by: Peter Holm
Approved by: re (kensmith)


# 9e223287 31-May-2007 Konstantin Belousov <kib@FreeBSD.org>

Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by: jhb
Reviewed by: daichi (unionfs)
Approved by: re (kensmith)


# 217f71d8 02-Feb-2007 Bruce M Simpson <bms@FreeBSD.org>

Use int instead of u_int for the 'extra' argument to the
clone_create() KPI.
This fixes a signedness bug in unit number comparisons.
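
The commit message does not spell out the failing comparison, so the sketch below only illustrates the general class of bug: when a signed value is compared against an unsigned one, it is converted to unsigned first. Both helper names are hypothetical and neither is taken from clone_create().

```c
#include <assert.h>

/* With an unsigned bound, a negative unit is converted to a huge value. */
static int
demo_below_unsigned(int unit, unsigned int bound)
{
	return ((unsigned int)unit < bound);
}

/* With a signed bound, negative values stay negative. */
static int
demo_below_signed(int unit, int bound)
{
	return (unit < bound);
}
```

For a unit of -1 and a bound of 8 the two helpers disagree, which is the kind of surprise that switching the argument type to int avoids.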

Submitted by: imp, Landon Fuller
PR: kern/105228
MFC after: 2 weeks


# 1663075c 20-Oct-2006 Konstantin Belousov <kib@FreeBSD.org>

Fix the race between devfs_fp_check and devfs_reclaim. Dereference the
vnode's v_rdev and increment the dev threadcount, as well as clear it
(in devfs_reclaim) under the dev_lock().

Reviewed by: tegge
Approved by: pjd (mentor)


# b0d081a0 12-May-2006 John-Mark Gurney <jmg@FreeBSD.org>

drop D_MEMDISK, not used in the tree...


# e606a3c6 19-Sep-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Revamp DEVFS internals pretty severely [1].

Give DEVFS a proper inode called struct cdev_priv. It is important
to keep in mind that this "inode" is shared between all DEVFS
mountpoints, therefore it is protected by the global device mutex.

Link the cdev_priv's into a list, protected by the global device
mutex. Keep track of each cdev_priv's state with a flag bit and
of references from mountpoints with a dedicated usecount.

Reap the benefits of much improved kernel memory allocator and the
generally better defined device driver APIs to get rid of the tables
of pointers + serial numbers, their overflow tables, the atomics
to muck about in them and all the trouble that resulted in.

This makes RAM the only limit on how many devices we can have.

The cdev_priv is actually a super struct containing the normal cdev
as the "public" part, and therefore allocation and freeing has moved
to devfs_devs.c from kern_conf.c.

The overall responsibility is (to be) split such that kern/kern_conf.c
is the stuff that deals with drivers and struct cdev and fs/devfs
handles filesystems and struct cdev_priv and their private liason
exposed only in devfs_int.h.

Move the inode number from cdev to cdev_priv and allocate inode
numbers properly with unr. Local dirents in the mountpoints
(directories, symlinks) allocate inodes from the same pool to
guarantee against overlaps.

Various other fields are going to migrate from cdev to cdev_priv
in the future in order to hide them. A few fields may migrate
from devfs_dirent to cdev_priv as well.

Protect the DEVFS mountpoint with an sx lock instead of lockmgr,
this lock also protects the directory tree of the mountpoint.

Give each mountpoint a unique integer index, allocated with unr.
Use it into an array of devfs_dirent pointers in each cdev_priv.
Initially the array points to a single element also inside cdev_priv,
but as more devfs instances are mounted, the array is extended with
malloc(9) as necessary when the filesystem populates its directory
tree.

Retire the cdev alias lists, the cdev_priv now know about all the
relevant devfs_dirents (and their vnodes) and devfs_revoke() will
pick them up from there. We still spelunk into other mountpoints
and fondle their data without 100% good locking. It may make better
sense to vector the revoke event into the tty code and there do a
destroy_dev/make_dev on the tty's devices, but that's for further
study.

Lots of shuffling of stuff and churn of bits for no good reason[2].

XXX: There is still nothing preventing the dev_clone EVENTHANDLER
from being invoked at the same time in two devfs mountpoints. It
is not obvious what the best course of action is here.

XXX: comment out an if statement that lost its body, until I can
find out what should go there so it doesn't do damage in the meantime.

XXX: Leave in a few extra malloc types and KASSERTS to help track
down any remaining issues.

Much testing provided by: Kris
Much confusion caused by (races in): md(4)

[1] You are not supposed to understand anything past this point.

[2] This line should simplify life for the peanut gallery.


# 74f46f19 15-Sep-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Retire unused dev_named() function.


# 516ad423 17-Aug-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Handle device drivers with D_NEEDGIANT in a way which does not
penalize the 'good' drivers: Allocate a shadow cdevsw and populate
it with wrapper functions which grab Giant


# 9c0af131 16-Aug-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Create a new internal .h file to communicate very private stuff
from kern_conf.c to devfs.

For now just two prototypes, more to come.


# 6a113b3d 08-Aug-2005 Robert Watson <rwatson@FreeBSD.org>

Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do. This avoids having multiple event handler types and
fall-back/precedence logic in devfs.

This changes the kernel API for /dev cloning, and may affect third
party packages containing cloning kernel modules.

Requested by: phk
MFC after: 3 days


# d26dd2d9 14-Jul-2005 Robert Watson <rwatson@FreeBSD.org>

When devfs cloning takes place, provide access to the credential of the
process that caused the clone event to take place for the device driver
creating the device. This allows cloned device drivers to adapt the
device node based on security aspects of the process, such as the uid,
gid, and MAC label.

- Add a cred reference to struct cdev, so that when a device node is
instantiated as a vnode, the cloning credential can be exposed to
MAC.

- Add make_dev_cred(), a version of make_dev() that additionally
accepts the credential to stick in the struct cdev. Implement it and
make_dev() in terms of a back-end make_dev_credv().

- Add a new event handler, dev_clone_cred, which can be registered to
receive the credential instead of dev_clone, if desired.

- Modify the MAC entry point mac_create_devfs_device() to accept an
optional credential pointer (may be NULL), so that MAC policies can
inspect and act on the label or other elements of the credential
when initializing the skeleton device protections.

- Modify tty_pty.c to register dev_clone_cred and invoke make_dev_cred(),
so that the pty clone credential is exposed to the MAC Framework.

While currently primarily focussed on MAC policies, this change is also
a prerequisite for changes to allow ptys to be instantiated with the UID
of the process looking up the pty. This requires further changes to the
pty driver -- in particular, to immediately recycle pty nodes on last
close so that the credential-related state can be recreated on next
lookup.

Submitted by: Andrew Reisse <andrew.reisse@sparta.com>
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
MFC after: 1 week
MFC note: Merge to 6.x, but not 5.x for ABI reasons


# 9477d73e 31-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

cdev (still) needs per instance uid/gid/mode

Add unlocked version of dev_ref()

Clean up various stuff in sys/conf.h


# eb151cb9 30-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Rename dev_ref() to dev_refl()


# ff7284ee 29-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the global cdev hash and use the cdevsw list instead.

Don't remove the now unused element from cdev yet, wait until
we have a better reason to bump the version.


# 60c5621a 18-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

makebdev() is long gone.


# bde1a9c9 17-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Kill MAJOR_AUTO


# 800b42bd 16-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Prepare for the final onslaught on devices:

Move uid/gid/mode from cdev to cdevsw.

Add kind field to use for devd(8) later.

Bump both D_VERSION and __FreeBSD_version


# d39ae27c 14-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Polish.


# 0a2e49f1 15-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Rename cdev->si_udev to cdev->si_drv0 to reflect the new nature of
the field.


# 8cca74f2 15-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Clean up forward struct decls.


# 0f64ffc0 15-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Move devtoname() prototype to systm.h to reduce #include pollution,
it is (or should be) used in many printf() calls.


# 3238ec33 08-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Fix signedness of minor2unit().


# aa2f6ddc 22-Feb-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Reap more benefits from DEVFS:

List devfs_dirents rather than vnodes off their shared struct cdev, this
saves a pointer field in the vnode at the expense of a field in the
devfs_dirent. There are often 100 times more vnodes so this is a bargain.
In addition it makes it harder for people to try to do stupid things like
"finding the vnode from cdev".

Since DEVFS handles all VCHR nodes now, we can do the vnode related
cleanup in devfs_reclaim() instead of in dev_rel() and vgonel().
Similarly, we can do the struct cdev related cleanup in dev_rel()
instead of devfs_reclaim().

rename idestroy_dev() to destroy_devl() for consistency.

Add LIST_ENTRY de_alias to struct devfs_dirent.
Remove v_specnext from struct vnode.
Change si_hlist to si_alist in struct cdev.
String new devfs vnodes' devfs_dirent on si_alist when
we create them and take them off in devfs_reclaim().

Fix devfs_revoke() accordingly. Also don't clear fields
devfs_reclaim() will clear when called from vgone();

Let devfs_reclaim() call dev_rel() instead of vgonel().

Move the usecount tracking from dev_rel() to devfs_reclaim(),
and let dev_rel() take a struct cdev argument instead of vnode.

Destroy SI_CHEAPCLONE devices in dev_rel() (instead of
devfs_reclaim()) when they are no longer used. (This
should maybe happen in devfs_close() instead.)


# 3a85fd26 29-Jan-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Add MAXMINOR #define, we should have had this a long time ago.

Add minor2unit() in addition to dev2unit() and unit2minor().

If it wasn't such a hassle we should redefine minor numbers in
the kernel without the gap for the major number, but it's not worth
the bother (yet).
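
The "gap" can be made concrete with a small sketch. It assumes the classic
layout in which bits 8..15 of the number belong to the major, so the minor
occupies bits 0..7 and 16..31; the toy_* names and the TOY_MAXMINOR value
are illustrative and not copied from the header.

```c
#include <assert.h>

#define TOY_MAXMINOR 0xffff00ffU	/* minor bits, skipping bits 8..15 */

/* Squeeze the gap out: a minor number becomes a linear unit number. */
static unsigned
toy_minor2unit(unsigned minor)
{
	assert((minor & ~TOY_MAXMINOR) == 0);
	return ((minor & 0xff) | (minor >> 8));
}

/* Reverse operation: spread a linear unit number around the gap. */
static unsigned
toy_unit2minor(unsigned unit)
{
	assert(unit <= 0xffffff);
	return ((unit & 0xff) | ((unit << 8) & ~0xffffU));
}
```

Under these assumptions unit 0xff maps to minor 0xff, while unit 0x100 maps
to minor 0x10000, which is the first value past the hole.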


# e207b52a 09-Nov-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Make getdiskbyname() static to vfs_mount.c.

Eliminate use of vn_todev() while here.


# 2723ec18 29-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove si_mountpoint and si_bsize_phys from cdev.


# 6afb3b1c 29-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Give dev_strategy() an explicit cdev argument in preparation for removing
buf->b_dev.

Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.

Assert copyright on vfs_bio.c and update copyright message to canonical
text. There is no legal difference between John Dyson's two-clause
abbreviated BSD license and the canonical text.


# 9b7cc97f 26-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unused si_bsize_best field from struct cdev.


# fae974f1 26-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Degeneralize the per cdev copyonwrite callback. The only possible value
is ffs_copyonwrite() and the only place it can be called from is FFS which
would never want to call another filesystem's copyonwrite method, should one
exist, so there is no reason why anything generic should know about this.


# 1a1b2800 25-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Get rid of the magic "stash" of cdev structures, we no longer call
make_dev() before malloc works.


# f8fe7a73 25-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Retire si_stripesize and si_stripeoffset they will not be needed in cdev
in the future.


# 2bdd5609 11-Oct-2004 Peter Wemm <peter@FreeBSD.org>

Belatedly catch up with the dev_t/cdev changes from a few months back.
Extract the struct cdev pointer and the tty device from inside rather than
incorrectly casting the 'struct cdev *' pointer to a 'dev_t' int. Not
that this was particularly important since it was only used for reading
vmcore files.


# ba285125 01-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Fix a LOR relating to freeing cdevs.


# 743cd76a 27-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Add cdevsw->d_purge() support.

This device method shall wake up any threads sleeping in the device driver
and make them depart the driver's code for good.


# b2deb1d2 24-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the cdevsw() function which is now unused.


# af8b1978 24-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove SI_ISDISK, I found a better solution.


# 2c15afd8 23-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce dev_re[lf]thread() functions.

dev_refthread() will return the cdevsw pointer or NULL. If the
return value is non-NULL a threadcount is held which must be released
with dev_relthread(). If the returned cdevsw is NULL no threadcount
is held on the device.
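
The contract described above can be sketched as follows. The structures
and toy_* names are hypothetical; the real functions also take the device
mutex, which this single-threaded illustration leaves out.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cdevsw {
	const char *d_name;
};

struct toy_cdev {
	struct toy_cdevsw *si_devsw;	/* NULL once the device is gone */
	unsigned long si_threadcount;	/* threads inside the driver */
};

/* Return the switch and count the caller in, or NULL and count nothing. */
static struct toy_cdevsw *
toy_refthread(struct toy_cdev *dev)
{
	struct toy_cdevsw *csw;

	csw = dev->si_devsw;
	if (csw != NULL)
		dev->si_threadcount++;
	return (csw);
}

/* Only legal after a non-NULL return from toy_refthread(). */
static void
toy_relthread(struct toy_cdev *dev)
{
	assert(dev->si_threadcount > 0);
	dev->si_threadcount--;
}
```

A caller would bracket each entry into the driver with the pair and skip
the release when the lookup returned NULL.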


# 1a52a73d 23-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Eliminate DEV_STRATEGY() macro: call dev_strategy() directly.

Make dev_strategy() handle errors and departing devices properly.


# a0e78d2e 23-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Do not refcount the cdevsw, but rather maintain a cdev->si_threadcount
of the number of threads which are inside whatever is behind the
cdevsw for this particular cdev.

Make the device mutex visible through dev_lock() and dev_unlock().
We may want finer granularity later.

Replace spechash_mtx use with dev_lock()/dev_unlock().


# 4eb84db6 17-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Add various stuff to struct tty and surrounding areas in preparation
for getting stuff from P4::phk_tty into -current.


# 67673e66 13-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Create struct snapdata which contains the snapshot fields from cdev
and the previously malloc'ed snapshot lock.

Malloc struct snapdata instead of just the lock.

Replace snapshot fields in cdev with pointer to snapdata (saves 16 bytes).

While here, give the private readblock() function a vnode argument
in preparation for moving UFS to access GEOM directly.


# 083d7d1c 11-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Add a threadcount field which we will need later for device removal
cleanup. Adding it now and MT5'ing will preserve binary compatibility
if this code is later MFC'ed.

MT5 candidate.


# 5263eebc 10-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Add two spare elements for planned but not yet implemented stuff related
to device driver unloading. Adding these two now and MT5'ing them will
allow us to preserve binary compatibility on RELENG_5 when these facilities
are MFC'ed.

MT5 candidate.


# 2e305620 11-Jul-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the last bits of SPECHASH.


# f3732fd1 17-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Second half of the dev_t cleanup.

The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.


# 89c9c53d 16-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# a64d4b26 04-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Move the line discipline related stuff out of <sys/conf.h> and into
<sys/linedisc.h> (repocopied).

Temporarily use a nested include from <sys/tty.h> to get <sys/linedisc.h>
into relevant source files.

Introduce a set of inline functions named ttyld_...() to invoke
linedisc methods instead of groping around in the linesw array.


# 3a95025f 01-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce a ttyioctl() cdevsw default function.


# 847dc1fe 01-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

shift the four cdevsw functions for ttys to sys/conf.h and prototype
them with the correct typedef.


# 82c6e879 06-Apr-2004 Warner Losh <imp@FreeBSD.org>

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# 9397290e 10-Mar-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Add clone_setup() function rather than rely on lazy initialization.

Requested by: rwatson


# cd690b60 21-Feb-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Device megapatch 6/6:

This is what we came here for: Hang dev_t's from their cdevsw,
refcount cdevsw and dev_t and generally keep track of things a lot
better than we used to:

Hold a cdevsw reference around all entrances into the device driver,
this will be necessary to safely determine when we can unload driver
code.

Hold a dev_t reference while the device is open.

KASSERT that we do not enter the driver on a non-referenced dev_t.

Remove old D_NAG code, anonymous dev_t's are not a problem now.

When destroy_dev() is called on a referenced dev_t, move it to
dead_cdevsw's list. When the refcount drops, free it.

Check that cdevsw->d_version is correct. If not, set all methods
to the dead_*() methods to prevent entrance into driver. Print
warning on console to this effect. The device driver may still
explode if it is also incompatible with newbus, but in that case
we probably didn't get this far in the first place.
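
The version check can be pictured with a short sketch. Everything here is
an assumption made for illustration: the TOY_D_VERSION value, the reduced
two-method switch, the warning text, and the choice of ENXIO as the error
the dead methods return.

```c
#include <errno.h>
#include <stdio.h>

#define TOY_D_VERSION 3		/* hypothetical value */

struct toy_cdevsw {
	int d_version;
	const char *d_name;
	int (*d_open)(void);
	int (*d_read)(void);
};

/* Dead methods: refuse every entry into the driver. */
static int toy_dead_open(void) { return (ENXIO); }
static int toy_dead_read(void) { return (ENXIO); }

/* Returns 1 if the switch is usable, 0 if it was neutered. */
static int
toy_prep_cdevsw(struct toy_cdevsw *sw)
{
	if (sw->d_version == TOY_D_VERSION)
		return (1);
	fprintf(stderr, "WARNING: driver \"%s\" has wrong d_version, "
	    "device disabled\n", sw->d_name);
	sw->d_open = toy_dead_open;
	sw->d_read = toy_dead_read;
	return (0);
}

/* Sample driver methods used to exercise the check. */
static int toy_ok_open(void) { return (0); }
static int toy_ok_read(void) { return (0); }
```

The point of the design is that a stale module fails safely at its first
use instead of being called through a switch whose layout no longer matches.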


# dc08ffec 21-Feb-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Device megapatch 4/6:

Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.


# b0b03348 21-Feb-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Device megapatch 2/6:

This commit adds a couple of functions for pseudodrivers to use for
implementing cloning in a manner we will be able to lock down (shortly).

Basically what happens is that pseudo drivers get a way to ask for
"give me the dev_t with this unit number" or alternatively "give
me a dev_t with the lowest guaranteed free unit number" (there is
unfortunately a lot of non-POLA in the exact numeric value of this
number, just live with it for now)

Managing the unit number space this way removes the need to use
rman(9) to do so in the drivers; this greatly simplifies the code in
the drivers because even using rman(9) they still needed to manage
their dev_t's anyway.

I have taken the if_tun, if_tap, snp and nmdm drivers through the
mill, partly because they (ab)used makedev(), but mostly because
together they represent three different problems for device-cloning:

if_tun and snp are the plain case: just give me a device.

if_tap has two kinds of devices, with a flag for device type.

nmdm has paired devices (ala pty) and you can clone either of them.
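
The two requests a pseudo driver can make ("this unit" or "lowest free
unit") can be sketched with a tiny allocator. This is not the interface
added by the commit; the toy_* names, the fixed pool size and the return
convention are assumptions chosen to keep the example short.

```c
#include <stddef.h>

#define TOY_MAXUNITS 8		/* small pool, for illustration only */

struct toy_clones {
	unsigned char used[TOY_MAXUNITS];
};

/*
 * unit >= 0: claim exactly that unit.
 * unit <  0: claim the lowest free unit.
 * Returns the unit claimed, or -1 if it is taken or the pool is full.
 */
static int
toy_clone_alloc(struct toy_clones *c, int unit)
{
	int u;

	if (unit >= 0) {
		if (unit >= TOY_MAXUNITS || c->used[unit])
			return (-1);
		c->used[unit] = 1;
		return (unit);
	}
	for (u = 0; u < TOY_MAXUNITS; u++) {
		if (!c->used[u]) {
			c->used[u] = 1;
			return (u);
		}
	}
	return (-1);
}

/* Give a unit back to the pool. */
static void
toy_clone_free(struct toy_clones *c, int unit)
{
	if (unit >= 0 && unit < TOY_MAXUNITS)
		c->used[unit] = 0;
}
```

With the unit space managed centrally like this, a driver no longer needs
its own rman(9) bookkeeping just to pick a number.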


# d986d458 18-Oct-2003 Poul-Henning Kamp <phk@FreeBSD.org>

The size and contents of the DEV_STRATEGY() macro has progressed to
the point where it being a macro is no longer sensible, and it will
only be more so in days to come.

BIO_STRATEGY() is now only used from DEV_STRATEGY() and should not
be used directly anymore.

Put the contents of both in the new function dev_strategy() and
make DEV_STRATEGY() call that function.

In addition, this allows us to make the rather magic bufdonebio()
helper function static.

This also saves a hundred-and-some bytes of code in a typical kernel.


# 0023f618 15-Oct-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce a new optional member function for cdevsw, fdopen() which
passes the fdidx from VOP_OPEN down.

This is for all I know the final API for this functionality, but
the locking semantics for messing with the filedescriptor from
the device driver are not settled at this time.


# 43102178 28-Sep-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Retire revoke_and_destroy_dev() with extreme prejudice.


# 370d48eb 27-Sep-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Add an explanation why MAJOR_AUTO should not be specified explicitly.


# b2941431 26-Sep-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce no_poll() default method for device drivers. Have it
do exactly the same as vop_nopoll() for consistency and put a
comment in the two pointing at each other.

Retire seltrue() in favour of no_poll().

Create private default functions in kern_conf.c instead of public
ones.

Change default strategy to return the bio with ENODEV instead of
doing nothing which would leave the bio stranded.

Retire public nullopen() and nullclose() as well as the entire band
of public no{read,write,ioctl,mmap,kqfilter,strategy,poll,dump}
functions, they are the default actions now.

Move the final two trivial functions from subr_xxx.c to kern_conf.c
and retire the now empty subr_xxx.c


# 04825a77 26-Sep-2003 Poul-Henning Kamp <phk@FreeBSD.org>

noopen() and noclose() are no longer used.


# 4e5f0687 26-Sep-2003 Poul-Henning Kamp <phk@FreeBSD.org>

nopsize is no longer used.


# 227f9a1c 24-Mar-2003 Jake Burkholder <jake@FreeBSD.org>

- Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses are larger than virtual addresses, such as i386s
with PAE.
- Use this to represent physical addresses in the MI vm system and in the
i386 pmap code. This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by: DARPA, Network Associates Laboratories
Discussed with: re, phk (cdevsw change)


# 441931b5 09-Mar-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Define MAJOR_AUTO as zero, which means that leaving out an initialization
of d_maj means "allocate major number automatically".

Keep the definition of MAJOR_AUTO to make life easier for cross-branch
source maintainers.


# ffaae05d 03-Mar-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Rearrange the members of struct cdevsw to be absolutely sure to catch
any initializations which are not done right.


# 182a9f74 03-Mar-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Make nokqfilter() return the correct return value.

Ditch the D_KQFILTER flag which was used to prevent calling NULL pointers.


# b1a89575 02-Mar-2003 Poul-Henning Kamp <phk@FreeBSD.org>

NO_GEOM cleanup:

Remove (actually: Obscurely rename) cdevsw->d_psize() to prevent future use.


# 9285a87e 02-Mar-2003 Poul-Henning Kamp <phk@FreeBSD.org>

NODEVFS cleanup:

Replace devfs_{create,destroy} hooks with direct function calls.


# 98bbd7aa 28-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

NO_GEOM cleanup:

Retire the "dev_t" centric version of the disk mini-layer.
Remove now unneeded linkage field in dev_t and struct disk.


# beea48b2 27-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Add support for allocating a device driver major number on demand.

To do this, initialize the d_maj member of the cdevsw to MAJOR_AUTO.
When the cdevsw is first passed to make_dev() a free major number
will be assigned.

Until we have a bit more experience with this a printf will announce
this fact.

Major numbers are not reclaimed, so loading/unloading the same
device driver which uses MAJOR_AUTO will eventually deplete the
pool of free major numbers and the system will panic when it can
not allocate one. Still undecided who to inconvenience with the
solution to this.


# f477b4fd 27-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

NODEVFS cleanup:

Remove cdevsw_add() and cdevsw_remove(), they served us well for a long time.
Bump __FreeBSD_version to 500104 to mark this.


# 07159f9c 24-Feb-2003 Maxime Henrion <mux@FreeBSD.org>

Cleanup of the d_mmap_t interface.

- Get rid of the useless atop() / pmap_phys_address() detour. The
device mmap handlers must now give back the physical address
without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned
int. Now we properly pass a vm_offset_t * and expect it to be
filled by the mmap handler when the mapping was successful. The
mmap handler must now return 0 when successful, any other value
is considered as an error. Previously, returning -1 was the only
way to fail. This change thus accidentally fixes some devices
which were bogusly returning errno constants which would have been
considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's
no longer used.
- Convert all the d_mmap_t consumers to the new API.
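
The new calling convention can be sketched with a reduced handler. The
real d_mmap_t takes more arguments (the device and the protection, among
others); the two-argument toy_mmap_t type, the toy_paddr_t typedef and the
window constants below are assumptions for illustration only.

```c
#include <errno.h>
#include <stdint.h>

typedef uint64_t toy_paddr_t;	/* stand-in for the physical address type */

/* New style: 0 on success with *paddr filled in, an errno value on failure. */
typedef int toy_mmap_t(uint64_t offset, toy_paddr_t *paddr);

#define TOY_WINDOW_BASE 0xfe000000ULL	/* hypothetical device window */
#define TOY_WINDOW_SIZE 0x10000ULL

static int
toy_mmap(uint64_t offset, toy_paddr_t *paddr)
{
	if (offset >= TOY_WINDOW_SIZE)
		return (EINVAL);	/* *paddr is left untouched on error */
	/* Hand back the raw physical address, no atop() applied. */
	*paddr = TOY_WINDOW_BASE + offset;
	return (0);
}
```

Because the address travels through the pointer, an error code can no
longer be mistaken for an address by the caller.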

I'm still not sure whether we need a __FreeBSD_version bump for this,
since we didn't guarantee API/ABI stability until 5.1-RELEASE.

Discussed with: alc, phk, jake
Reviewed by: peter
Compile-tested on: LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on: i386


# 2c6b49f6 21-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

NO_GEOM cleanup:

Retire the "d_dump_t" and use the "dumper_t" type instead.

Dumper_t takes a void * as first arg which is more general than the
dev_t taken by d_dump_t. (Remember: we could have net-dumpers if
somebody wrote us one!)

Define the convention for GEOM controlled disk devices to be that the
first argument to the dumper function is the struct disk pointer.

Change device drivers accordingly.


# 8a63edc3 11-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Better names for struct disk elements: d_maxsize, d_stripeoffset
and d_stripesize.

Introduce si_stripesize and si_stripeoffset in struct cdev so we
can make them visible to clustering code.

Add stripesize and stripeoffset to providers.

DTRT with stripesize and stripeoffset in various places in GEOM.


# a0ca480c 10-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Retire D_CANFREE flag.


# 6cadf88d 11-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce SI_CANDELETE flag on dev_t.


# 237d2765 04-Feb-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Pave the road to removing the fixed size limit on device nodes:

Change the si_name of dev_t's to be a char * and put a private buffer for
holding the name at the end of the struct.

Initialize si_name to point to the private buffer.

Put a KASSERT in geom_disk to prevent overrun on the fake dev_t we still
have to generate for the disk_drivers.
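
The pointer-plus-private-buffer arrangement can be sketched like this. The
structure is heavily reduced and the names toy_cdev, si_namebuf and
toy_cdev_setname() are hypothetical; the length check plays the role the
commit gives to the KASSERT.

```c
#include <stdio.h>
#include <string.h>

struct toy_cdev {
	int si_flags;
	char *si_name;		/* what consumers read */
	char si_namebuf[16];	/* private storage at the end of the struct */
};

/* Returns 0 on success, -1 if the name would overrun the buffer. */
static int
toy_cdev_setname(struct toy_cdev *dev, const char *name)
{
	if (strlen(name) >= sizeof(dev->si_namebuf))
		return (-1);
	dev->si_name = dev->si_namebuf;
	snprintf(dev->si_namebuf, sizeof(dev->si_namebuf), "%s", name);
	return (0);
}
```

Since consumers only ever follow si_name, the storage behind it can later
be swapped for something larger without touching them.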


# 302b6cd1 21-Jan-2003 Peter Wemm <peter@FreeBSD.org>

Remove OBE prototype for iszerodev() - it was replaced by the
D_MMAP_ANON device switch flag.


# 7e760e14 19-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Originally when DEVFS was added, a global variable "devfs_present"
was used to control code which was conditional on DEVFS' presence
since this avoided the need for large-scale source pollution with
#include "opt_geom.h"

Now that we approach making DEVFS standard, replace these tests
with an #ifdef to facilitate mechanical removal once DEVFS becomes
non-optional.

No functional change by this commit.


# 42c43e60 03-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Make struct swblock kernel only, to make vm/swap_pager.h userland includable.
Move struct swdevt from sys/conf.h to the more appropriate vm/swap_pager.h.
Adjust #include use in libkvm and pstat(8) to match.


# e2a3ea1c 02-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unused second argument from DEV_STRATEGY().


# d616ee08 02-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unused second argument from BIO_STRATEGY()


# 92da00bb 15-Dec-2002 Matthew Dillon <dillon@FreeBSD.org>

This is David Schultz's swapoff code which I am finally able to commit.
This should be considered highly experimental for the moment.

Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
MFC after: 3 weeks


# 0db138a6 13-Dec-2002 Kirk McKusick <mckusick@FreeBSD.org>

Only the most recent snapshot contains the complete list of blocks
that were copied in all of the earlier snapshots, thus its precomputed
list must be used in the copyonwrite test. Using incomplete lists may
lead to deadlock. Also do not include the blocks used for the indirect
pointers in the indirect pointers as this may lead to inconsistent
snapshots.

Sponsored by: DARPA & NAI Labs.
Approved by: re


# a2fb4fed 24-Oct-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Fix the spechash lock order reversal by keeping an updated sum
of v_usecount in the dev_t which vcount() can return without
locking any vnodes.

Seen by: jhb


# 2e27173f 27-Sep-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Add a D_NOGIANT flag which can be set in a struct cdevsw to indicate
that a particular device driver is not Giant-challenged.

SPECFS will DROP_GIANT() ... PICKUP_GIANT() around calls to the
driver in question.

Notice that the interrupt path is not affected by this!

This does _NOT_ work for drivers accessed through cdevsw->d_strategy()
ie drivers for disk(-like), some tapes, maybe others.


# ca916247 27-Sep-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Rename struct specinfo to the more appropriate struct cdev.

Agreed on: jake, rwatson, jhb


# 28e7ba4d 21-Apr-2002 Mark Murray <markm@FreeBSD.org>

Used protected names (_foo) for parameter names. This helps clean up
a boatload of lint warnings.


# 3c8451c9 19-Apr-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Add a SI_DUMPDEV flag for devices.

Sponsored by: DARPA & NAI Labs.


# 81661c94 31-Mar-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Here follows the new kernel dumping infrastructure.

Caveats:

The new savecore program is not complete in the sense that it emulates
enough of the old savecore's features to do the job, but implements none
of the options yet.

I would appreciate if a userland hacker could help me out getting savecore
to do what we want it to do from a user's point of view, compression,
email-notification, space reservation etc etc. (send me email if
you are interested).

Currently, savecore will scan all devices marked as "swap" or "dump" in
/etc/fstab _or_ any devices specified on the command-line.

All architectures but i386 lack an implementation of dumpsys(), but
looking at the i386 version it should be trivial for anybody familiar
with the platform(s) to provide this function.

Documentation is quite sparse at this time, more to come.

Details:

ATA and SCSI drivers should work as the dump formatting code has been
removed. The IDA, TWE and AAC have not yet been converted.

Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set
the device as dumpdev. To implement the "off" argument, /dev/null
is used as the device.

Savecore will fail if handed any options since they are not (yet)
implemented. All devices marked "dump" or "swap" in /etc/fstab
will be scanned and dumps found will be saved to diskfiles
named from the MD5 hash of the header record. The header record
is dumped in readable format in the .info file. The kernel
is not saved. Only complete dumps will be saved.

All maintainer rights for this code are disclaimed: feel free to
improve and extend.

Sponsored by: DARPA, NAI Labs


# eaf1c66c 30-Mar-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Move the "dumping" variable from systm.h to conf.h.


# c58eb46e 23-Mar-2002 Bruce Evans <bde@FreeBSD.org>

Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses. Switch to KNF
formatting and/or rewrap the whole prototype in some cases.


# 789f12fe 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P


# 01de1b13 10-Mar-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Make the proposed name arg to dev_stdclone() const.


# 1fd9f8f4 16-Feb-2002 Brian Feldman <green@FreeBSD.org>

Add revoke_and_destroy_dev(), to be used by devices which decide when
they choose to destroy themselves without regard to whether or not
they are open.


# 5744b3ea 25-Nov-2001 Dima Dorfman <dd@FreeBSD.org>

DEVFS has resurfaced.


# b66b71c9 11-Nov-2001 Warner Losh <imp@FreeBSD.org>

It turns out my reasons for using a few d_thread_t's were bogus. Revert
them back to struct thread *.

Submitted by: bde


# f73dc36f 10-Nov-2001 Warner Losh <imp@FreeBSD.org>

add note about why I used d_thread_t in the prototypes.


# 991f9760 23-Oct-2001 Jonathan Lemon <jlemon@FreeBSD.org>

Implement multiple low-level console support.


# 7e7c3f3f 17-Oct-2001 Jonathan Lemon <jlemon@FreeBSD.org>

Add dev_named(dev, name), which is similar in spirit to devtoname().
This function returns success if the device is known by either 'name'
or any of its aliases.


# cf7ed683 01-Oct-2001 Warner Losh <imp@FreeBSD.org>

Add d_thread_t. This is a typedef for struct thread in -current and
will be one for struct proc in stable. Those drivers needing to have
cross version portability should use d_thread_t instead of inventing
their own means. Non-drivers, and drivers that either only run on
-current or must look under the covers of the struct proc/thread
must not use this.

As noted in arch@, this minorly violates style(9), but the sys/conf.h
devsw already violates this and all I'm doing is extending the
violation to ease the burden on device driver writers. It was judged
that this minor violation, which doesn't impact userland or those
people not using it, was preferable to the alternatives (eg #define
proc thread). C does not allow a way to rename or alias structs
easily, so we fall back to using a typedef.

Bump FreeBSD_version to reflect this change (porters guide to be done
in a separate commit).


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doozy!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# c7021493 02-Aug-2001 Warner Losh <imp@FreeBSD.org>

Make the fmt arguments to make_dev and make_dev_alias const char *.

Approved on IRC as long as it didn't cause a large number of warnings by: phk

MFC After: 700 hours


# 22cf0fb3 04-Jun-2001 Dima Dorfman <dd@FreeBSD.org>

Unstaticize l_nullioctl; it is needed elsewhere (like in tty_snoop.c).

Suggested by: bde


# 2ac1b1f4 29-May-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unused d_parms_t typedef

Spotted by: grog


# 3344c5a1 26-May-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Create a general facility for making dev_t's depend on another
dev_t. The dev_depends(dev_t, dev_t) function is for tying them
to each other.

When destroy_dev() is called on a dev_t, all dev_t's depending
on it will also be destroyed (depth first order).

Rewrite the make_dev_alias() to use this dependency facility.
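
The dependency rule can be sketched with a small tree. The toy_dev
structure, its child/sibling links and the sequence counter are assumptions
made for the example; they only show what "all dependents go first, depth
first" means.

```c
#include <stddef.h>

struct toy_dev {
	const char *name;
	struct toy_dev *first_child;	/* devices depending on this one */
	struct toy_dev *next_sibling;
	int destroyed;			/* 0 = alive, else destruction order */
};

static int toy_destroy_seq;

/* Tie child to parent: destroying parent will destroy child too. */
static void
toy_dev_depends(struct toy_dev *parent, struct toy_dev *child)
{
	child->next_sibling = parent->first_child;
	parent->first_child = child;
}

/* Destroy dev after all of its dependents; returns how many went away. */
static int
toy_destroy_dev(struct toy_dev *dev)
{
	struct toy_dev *c;
	int n = 0;

	while ((c = dev->first_child) != NULL) {
		dev->first_child = c->next_sibling;
		c->next_sibling = NULL;
		n += toy_destroy_dev(c);
	}
	dev->destroyed = ++toy_destroy_seq;
	return (n + 1);
}
```

In this picture an alias is simply one more dependent of the device it
names, so it disappears along with it.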

kern/subr_disk.c:
Make the disk mini-layer use dependencies to make sure all
relevant dev_t's are removed when the disk disappears.

Make the disk mini-layer precreate some magic sub devices
which the disk/slice/label code expects to be there.

kern/subr_disklabel.c:
Remove some now unneeded variables.

kern/subr_diskmbr.c:
Remove some ancient, commented out code.

kern/subr_diskslice.c:
Minor cleanup. Use name from dev_t instead of dsname()


# 9d9bdb3d 24-May-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Restrict even further what parts of <sys/conf.h> can be seen from
userland.


# fb919e4d 01-May-2001 Mark Murray <markm@FreeBSD.org>

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h from kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# f8388051 25-Mar-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Send the remains (such as I have located) of "block major numbers" to
the bit-bucket.


# 589c7af9 07-Mar-2001 Kirk McKusick <mckusick@FreeBSD.org>

Fixes to track snapshot copy-on-write checking in the specinfo
structure rather than assuming that the device vnode would reside
in the FFS filesystem (which is obviously a broken assumption with
the device filesystem).


# edfa785a 23-Feb-2001 Robert Watson <rwatson@FreeBSD.org>

Introduce per-swap area accounting in the VM system, and export
this information via the vm.nswapdev sysctl (number of swap areas)
and vm.swapdevX nodes (where X is the device), which contain the MIBs
dev, blocks, used, and flags. These changes are required to allow
top and other userland swap-monitoring utilities to run without
setgid kmem.

Submitted by: Thomas Moestl <tmoestl@gmx.net>
Reviewed by: freebsd-audit


# 608a3ce6 15-Feb-2001 Jonathan Lemon <jlemon@FreeBSD.org>

Extend kqueue down to the device layer.

Backwards compatible approach suggested by: peter


# a16d0eb2 31-Oct-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Deprecate devsw->d_bmaj entirely.

This removes support for booting current kernels with very old bootblocks.

Device driver writers: Please remove initializations for the d_bmaj
field in your cdevsw{}.


# 7eb9fca5 09-Oct-2000 Eivind Eklund <eivind@FreeBSD.org>

Blow away the v_specmountpoint define, replacing it with what it was
defined as (rdev->si_mountpoint)


# b0d17ba6 19-Sep-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Rename lminor() to dev2unit(). This function gives a linear unit number
which hides the 'hole' in the minor bits.

Introduce unit2minor() to do the reverse operation.

Fix some make_dev() calls which didn't use UID_* or GID_* macros.

Kill the v_hashchain alias macro, it hides the real relationship.

Introduce experimental SI_CHEAPCLONE flag and set it on cloned bpfs.
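
As an illustration of the 'hole' that dev2unit() hides, here is a minimal
userland sketch. It assumes the historical 32-bit layout in which the major
number sits in bits 8..15 and the minor bits are split around it. The
sketch_* names and the plain uint32_t type are hypothetical, and this is not
taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 0..7 and 16..31 hold the minor, bits 8..15 the major. */
#define SKETCH_MAJOR_MASK 0x0000ff00u

/* Minor bits (with the hole) -> dense, linear unit number. */
static uint32_t
sketch_dev2unit(uint32_t minor_bits)
{
	/* The caller must already have stripped the major number. */
	assert((minor_bits & SKETCH_MAJOR_MASK) == 0);
	return ((minor_bits & 0xffu) | ((minor_bits >> 8) & 0x00ffff00u));
}

/* Linear unit number -> minor bits with the hole reopened. */
static uint32_t
sketch_unit2minor(uint32_t unit)
{
	/* Only 24 minor bits are available in this assumed layout. */
	assert(unit <= 0x00ffffffu);
	return ((unit & 0xffu) | ((unit << 8) & 0xffff0000u));
}
```

Under these assumptions the two functions are inverses of each other for
any unit number that fits in 24 bits.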


# a2cdd9d9 16-Sep-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Move SPECNAMELEN from <sys/conf.h> to <sys/param.h>


# e2397c3a 11-Sep-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Prevent multiple make_dev() calls on the same dev_t and similar bogosities.
A couple of new warnings may be emitted during boot if drivers DTWT.

Tested by: George Cox <gjvc@gjvc.com>


# db901281 02-Sep-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Avoid the modules madness I inadvertently introduced by making the
cloning infrastructure standard in kern_conf. Modules are now
the same with or without devfs support.

If you need to detect if devfs is present, in modules or elsewhere,
check the integer variable "devfs_present".

This happily removes an ugly hack from kern/vfs_conf.c.

This forces a rename of the eventhandler and the standard clone
helper function.

Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include
like <sys/queue.h>

Remove all #includes of opt_devfs.h they no longer matter.


# a481b90b 24-Aug-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Fix panic when removing open device (found by bp@)
Implement subdirs.
Build the full "devicename" for cloning functions.
Fix panic when deleted device goes away.
Collapse devfs_dir and devfs_dirent structures.
Add proper cloning to the /dev/fd* "device-"driver.
Fix a bug in make_dev_alias() handling which made aliases appear
multiple times.
Use devfs_clone to implement getdiskbyname()
Make specfs maintain the stat(2) timestamps per dev_t


# 3f54a085 20-Aug-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Remove all traces of Julian's DEVFS (incl from kern/subr_diskslice.c)

Remove old DEVFS support fields from dev_t.

Make uid, gid & mode members of dev_t and set them in make_dev().

Use correct uid, gid & mode in make_dev in disk minilayer.

Add support for registering alias names for a dev_t using the
new function make_dev_alias(). These will show up as symlinks
in DEVFS.

Use makedev() rather than make_dev() for MFS's magic devices to prevent
DEVFS from noticing this abuse.

Add a field for DEVFS inode number in dev_t.

Add new DEVFS in fs/devfs.

Add devfs cloning to:
disk minilayer (ie: ad(4), sd(4), cd(4) etc etc)
md(4), tun(4), bpf(4), fd(4)

If DEVFS is configured, add the -d flag to /sbin/init's args to make it mount devfs.

Add commented out DEVFS to GENERIC


# b7ffb342 03-Jul-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Pull the rug out from under block mode devices. They return ENXIO on open(2) now.


# 3fce6910 25-Jun-2000 Mark Murray <markm@FreeBSD.org>

Add extra flag needed by nulldev/mmap.

Thanks to: Jeroen van Gelderen
Reviewed by: dfr


# e3975643 25-May-2000 Jake Burkholder <jake@FreeBSD.org>

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 740a1973 23-May-2000 Jake Burkholder <jake@FreeBSD.org>

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 192c06ea 09-May-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Change the "bdev-whiner" to whine when open is attempted and extend
the deadline a month.


# 017ef345 01-May-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Give struct bio its own call back mechanism.


# 67f3c95c 25-Apr-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Clone the {b|bio}_offset field, and make sure it is always initialized
in struct bio. Eventually, bio_offset will probably obsolete the
bio_blkno and bio_pblkno fields.

Remove the special hack in atapi-cd.c to determine if bio_offset was valid.


# 8177437d 14-Apr-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Complete the bio/buf divorce for all code below devfs::strategy

Exceptions:
Vinum untouched. This means that it cannot be compiled.
Greg Lehey is on the case.

CCD not converted yet, casts to struct buf (still safe)

atapi-cd casts to struct buf to examine B_PHYS


# 16aae9cb 20-Mar-2000 Brian Feldman <green@FreeBSD.org>

Split the logic of
static int setrootbyname(char *name);
out into
dev_t getdiskbyname(char *name);

This makes it easy to create a new DDB command, which is the big reason
for the change. You can now do the following in DDB:

Example rc.conf entry:
dumpdev="/dev/ad0s1b" # Device name to crashdump to (if enabled).

db> show disk/ad0s1b
dev_t = 0xc0b7ea00
db> p *dumpdev
c0b7ea00


# ce6acbb6 19-Mar-2000 Poul-Henning Kamp <phk@FreeBSD.org>

diff, patch and cvs didn't like these three last time around, try again.


# 21144e3b 20-Mar-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd. The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue. It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch was machine generated (Thanks to style(9) compliance!)

Vinum users: Greg has not had time to test this yet, be careful.
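
The "exactly one bit set" rule for b_iocmd, and the replacement of B_CALL
by a NULL test on b_iodone, can be sketched as below. This is only an
illustration: the SKETCH_* command values and the sketch_buf structure are
hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command bits; the log does not give the real values. */
#define SKETCH_CMD_READ   0x01u
#define SKETCH_CMD_WRITE  0x02u
#define SKETCH_CMD_DELETE 0x04u

/* Hypothetical stand-in for the two struct buf fields discussed above. */
struct sketch_buf {
	unsigned int iocmd;
	void (*iodone)(struct sketch_buf *);
};

/* True only when exactly one bit is set; zero is rejected. */
static bool
sketch_iocmd_valid(unsigned int iocmd)
{
	return (iocmd != 0 && (iocmd & (iocmd - 1)) == 0);
}

/* Stands in for the old B_CALL flag test: a callback is simply non-NULL. */
static bool
sketch_has_callback(const struct sketch_buf *bp)
{
	return (bp->iodone != NULL);
}
```

Because zero fails the check, an uninitialized command field is caught
instead of being silently treated as a write, which is the kind of mistake
the zero-valued B_WRITE invited.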


# e8359a57 07-Feb-2000 Søren Schmidt <sos@FreeBSD.org>

Do refcounting of open devices (more) correctly.

count_dev function by phk.


# 664a31e4 28-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistent with the other
BSDs, which made this change quite some time ago. More commits to come.


# 38941f35 29-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the now unused chrtoblk() function.


# 71e4fff8 26-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Retire MFS_ROOT and MFS_ROOT_SIZE options from the MFS implementation.

Add MD_ROOT and MD_ROOT_SIZE options to the md driver.

Make the md driver handle MFS_ROOT and MFS_ROOT_SIZE options for compatibility.

Add md driver to GENERIC, PCCARD and LINT.

This is a cleanup which removes the need for some of the worse hacks in
MFS: We really want to have a rootvnode but MFS on a preloaded image
doesn't really have one. md is a true device, so it is less trouble.

This has been tested with make release, and if people remember to add
the "md" pseudo-device to their kernels, PicoBSD should be just fine
as well. If people have no other use for MFS, it can be removed from
the kernel.


# 900942ba 08-Nov-1999 Peter Wemm <peter@FreeBSD.org>

Zap devsw_module_handler().


# 44d1184e 08-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Rename remove_dev() to destroy_dev().

Nagged about by: msmith


# 30d5764d 07-Nov-1999 Peter Wemm <peter@FreeBSD.org>

Don't indirect via devsw_module_handler() for DEV_MODULE() routines, have
the supplied (if any) function and argument called directly.


# be8479a8 06-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the iskmemdev() function. Make it the responsibility of the mem.c
drivers to enforce the securelevel checks.


# d53dedee 07-Nov-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the devsw magic from DEV_MODULE()


# adab70d6 03-Oct-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Warn once per driver about dev_t's not registered with make_dev().


# b89392e7 30-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the D_NOCLUSTER[RW] options which were added because vn had
problems. Now that Matt has fixed vn, this can go. The vn driver
should have used d_maxio (now si_iosize_max) anyway.


# 90e928d6 25-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

For some reason patch didn't remove these three lines first time around.


# d6a0e38a 25-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove five now unused fields from struct cdevsw. They should never
have been there in the first place. A GENERIC kernel shrinks almost 1k.

Add a slightly different safetybelt under nostop for tty drivers.

Add some missing FreeBSD tags


# ae8e1d08 25-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

This patch clears the way for removing a number of tty related
fields in struct cdevsw:

d_stop moved to struct tty.
d_reset already unused.
d_devtotty linkage now provided by dev_t->si_tty.

These fields will be removed from struct cdevsw together with
d_params and d_maxio Real Soon Now.

The changes in this patch consist of:

initialize dev->si_tty in *_open()
initialize tty->t_stop
remove devtotty functions
rename ttpoll to ttypoll
a few adjustments to these changes in the generic code
a bump of __FreeBSD_version
add a couple of FreeBSD tags


# c428d4c0 22-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Kill the cdevsw->d_maxio field.

d_maxio is replaced by the dev->si_iosize_max field, which the driver
should set in all calls to cdevsw->d_open if it has a better
idea than the system wide default.

The field is a generic dev_t field (ie: not disk specific) so that
tapes and other devices can use physio as well.


# fae03f66 20-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Step one of replacing devsw->d_maxio with si_bsize_max.

Rename dev->si_bsize_max to si_iosize_max and set it in spec_open
if the device didn't.

Set vp->v_maxio from dev->si_bsize_max in spec_open rather than
in ufs_bmap.c


# c32cc149 12-Sep-1999 Bruce Evans <bde@FreeBSD.org>

Const'ify devtoname() and d_name. This exposes some errors (2 non-benign).

Handle negative minor numbers properly in devtoname().


# 49c68457 12-Sep-1999 Brian Feldman <green@FreeBSD.org>

Correction: mem.c devices are "D_MEM" (and D_MEM is added.)

Taken issue with by: phk


# 7012bab9 02-Sep-1999 Julian Elischer <julian@FreeBSD.org>

Revert a bunch of controversial changes by PHK. After
a quick think and discussion among various people some form of some of
these changes will probably be recommitted.

The reversion was requested by dg while discussions proceed.
PHK has indicated that he can live with this, and it has been agreed
that some form of some of these changes may return shortly after further
discussion.


# c5b72c3d 30-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

s/si_tty_tty/si_tty/g


# 02e15769 30-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Make bdev userland access work like cdev userland access unless
the highly non-recommended option ALLOW_BDEV_ACCESS is used.

(bdev access is evil because you don't get write errors reported.)

Kill si_bsize_best before it kills Matt :-)

Use the specfs routines rather than having cloned copies in devfs.


# da9e4f55 29-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Add micro "disk" layer which should enable us to pull all the slice/label
stuff out of the device drivers.


# d137accc 29-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Add dev_t freeing code. Controlled by sysctl debug.free_devt, default
is off.


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# dbafb366 26-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Simplify the handling of VCHR and VBLK vnodes using the new dev_t:

Make the alias list a SLIST.

Drop the "fast recycling" optimization of vnodes (including
the returning of a preexisting but stale vnode from checkalias).
It doesn't buy us anything now that we don't hardlimit
vnodes anymore.

Rename checkalias2() and checkalias() to addalias() and
addaliasu() - which take a dev_t and a udev_t arg respectively.

Make the revoke syscalls use vcount() instead of VALIASED.

Remove VALIASED flag, we don't need it now and it is faster
to traverse the much shorter lists than to maintain the
flag.

vfs_mountedon() can check the dev_t directly, all the vnodes
point to the same one.

Print the devicename in specfs/vprint().

Remove a couple of stale LFS vnode flags.

Remove unimplemented/unused LK_DRAINED.


# 2a1be833 25-Aug-1999 Julian Elischer <julian@FreeBSD.org>

Make a place to store the devfs hook for the block device, as the same
specinfo is used to identify both raw and block versions of a device.


# 9dcbe240 23-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Convert DEVFS hooks in (most) drivers to make_dev().

Diskslice/label code not yet handled.

Vinum, i4b, alpha, pc98 not dealt with (left to respective Maintainers)

Add the correct hook for devfs to kern_conf.c

The net result of this exercise is that a lot fewer files depend on DEVFS,
and devtoname() gets more sensible output in many cases.

A few drivers had minor additional cleanups performed relating to cdevsw
registration.

A few drivers don't register a cdevsw{} anymore, but only use make_dev().


# 1744fcd0 20-Aug-1999 Julian Elischer <julian@FreeBSD.org>

First small steps at merging DEVFS and PHK's Dev_t stuff.


# b8e49f68 17-Aug-1999 Bill Fumerola <billf@FreeBSD.org>

Welcome devtoname(), to most likely be used when printing information
about a dev_t.

printf("%x", dev) now becomes printf("%s", devtoname(dev)) because
printing actual information about the device is much more useful than
printing a pointer to an address that would never help the developer debug.

Submitted by: phk, bde


# 9a27d579 15-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce lminor(dev_t dev), which returns a linear minor number,
ie: hides the fact that the major number is stuck in the middle.


# 49ff4deb 14-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Spring cleaning around strategy and disklabels/slices:

Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout.
please see comment in sys/conf.h about the flag argument.

Remove strategy argument from all the diskslice/label/bad144
implementations, it should be found from the dev_t.

Remove bogus and unused strategy1 routines.

Remove open/close arguments from dssize(). Pick them up from dev_t.

Remove unused and unfinished setgeom support from diskslice/label/bad144 code.


# 2820b2e7 13-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Add support for device drivers which want to track all open/close
operations. This allows a device driver better insight into
what is going on than the current:

proc1: open /dev/foo R/O
        devsw->open( R/O, proc1 )
proc2: open /dev/foo R/W
        devsw->open( R/W, proc2 )
proc2: close
        /* nothing, but device is
           really only R/O open */
proc1: close
        devsw->close( R/O, proc1 )


# 7dc5cd04 13-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

The bdevsw() and cdevsw() are now identical, so kill the former.


# 4d4f9323 13-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

s/v_specinfo/v_rdev/


# a4f59f82 09-Aug-1999 John Polstra <jdp@FreeBSD.org>

Include <sys/queue.h> since this header now depends on it.


# 0ef1c826 08-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,
a few lines into <sys/vnode.h>.

Add a few fields to struct specinfo, paving the way for the fun part.


# 698bfad7 20-Jul-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Now a dev_t is a pointer to struct specinfo which is shared by all specdev
vnodes referencing this device.

Details:
cdevsw->d_parms has been removed, the specinfo is available
now (== dev_t) and the driver should modify it directly
when applicable, and the only driver doing so, does so:
vn.c. I am not sure the logic in checking for "<" was right
before, and it looks even less so now.

An initial pool of 50 struct specinfo are depleted during
early boot, after that malloc had better work. It is
likely that fewer than 50 would do.

Hashing is done from udev_t to dev_t with a prime number
remainder hash, experiments show no better hash available
for decent cost (MD5 is only marginally better). The prime
number used should not be close to a power of two, we use
83 for now.

Add new checkalias2() to get around the loss of info from
dev2udev() in bdevvp();

The aliased vnodes are hung on a list straight off the dev_t,
and speclisth[SPECSZ] is unused. The sharing of struct
specinfo means that the v_specnext moves into the vnode
which grows by 4 bytes.

Don't use a VBLK dev_t which doesn't make sense in MFS, now
we hang a dummy cdevsw on B/Cmaj 253 so that things look sane.

Storage overhead from all of this is O(50k).

Bump __FreeBSD_version to 400009

The next step will add the stuff needed so device-drivers can start to
hang things from struct specinfo
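
The prime number remainder hash from udev_t to dev_t described above can be
sketched as follows. The modulus 83 comes from the text; everything else
(the sketch_* names, the uint32_t stand-in for udev_t, and the chained
bucket layout) is an assumption made for illustration, not the kernel's
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Prime modulus named in the log; chosen to be far from a power of two. */
#define SKETCH_DEVT_HASH 83u

/* Hypothetical stand-in for struct specinfo, keyed by the udev_t value. */
struct sketch_specinfo {
	uint32_t udev;
	struct sketch_specinfo *next;
};

static struct sketch_specinfo *sketch_buckets[SKETCH_DEVT_HASH];

/* Remainder hash: bucket index is the udev_t value modulo the prime. */
static unsigned int
sketch_devt_hash(uint32_t udev)
{
	return (udev % SKETCH_DEVT_HASH);
}

/* Push an entry onto the front of its bucket's chain. */
static void
sketch_insert(struct sketch_specinfo *si)
{
	unsigned int h = sketch_devt_hash(si->udev);

	si->next = sketch_buckets[h];
	sketch_buckets[h] = si;
}

/* Walk one bucket's chain; returns NULL when the udev_t is unknown. */
static struct sketch_specinfo *
sketch_lookup(uint32_t udev)
{
	struct sketch_specinfo *si;

	for (si = sketch_buckets[sketch_devt_hash(udev)]; si != NULL;
	    si = si->next)
		if (si->udev == udev)
			return (si);
	return (NULL);
}
```

A modulus far from a power of two keeps values that differ only in a few
high or low bits from piling into the same bucket, which is presumably why
the log calls it out.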


# 9806ce5b 17-Jul-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Add a field to struct swdevt to avoid a bogus udev2dev() call.


# 0431da87 06-Jul-1999 Mike Smith <msmith@FreeBSD.org>

Reinstate the previous fix for the broken export of a dev_t in sw_dev, convert
back to a dev_t when the value is actually used.


# 4116f679 06-Jul-1999 Brian Feldman <green@FreeBSD.org>

Back out previous commit. It was wrong, and caused panics.


# d40e02b3 06-Jul-1999 Mike Smith <msmith@FreeBSD.org>

swdevt should contain a udev_t not a devt. This resulted in bogus
swap device name reporting.

Submitted by: Bill Swingle <unfurl@freebsd.org>


# e26b8fd6 04-Jul-1999 Poul-Henning Kamp <phk@FreeBSD.org>

fix DEV_MODULE, I overlooked this one in my last commit


# 03016f42 04-Jul-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove cmaj and bmaj args from DEV_DRIVER_MODULE.


# 9a9eb2b9 25-Jun-1999 Greg Lehey <grog@FreeBSD.org>

Add function cdevsw_remove, the opposite of cdevsw_add: remove an
entry in cdevsw (and bdevsw if appropriate).

Reviewed-by: phk


# 6fcd8a7c 01-Jun-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce the makebdev() function, it does the same as the makedev()
function for now, but that will change.


# 2447bec8 31-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Simplify cdevsw registration.

The cdevsw_add() function now finds the major number(s) in the
struct cdevsw passed to it. cdevsw_add_generic() is no longer
needed, cdevsw_add() does the same thing.

cdevsw_add() will print a message if the d_maj field looks bogus.

Remove nblkdev and nchrdev variables. Most places they were used
bogusly. Instead check a dev_t for validity by seeing if devsw()
or bdevsw() returns NULL.

Move bdevsw() and devsw() functions to kern/kern_conf.c

Bump __FreeBSD_version to 400006

This commit removes:
72 bogus makedev() calls
26 bogus SYSINIT functions

if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.

I4b and vinum not changed. Patches emailed to authors. LINT
probably broken until they catch up.


# 4e2f199e 30-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

This commit should be an extensive NO-OP:

Reformat and initialize correctly all "struct cdevsw".

Initialize the d_maj and d_bmaj fields.

The d_reset field was not removed, although it is never used.

I used a program to do most of this, so all the files now use the
same consistent format. Please keep it that way.

Vinum and i4b not modified, patches emailed to respective authors.


# a89c3bd3 12-May-1999 Peter Wemm <peter@FreeBSD.org>

dev is a pointer, printf it as such


# bfbb9ce6 11-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
major() umajor()
minor() uminor()
makedev() umakedev()
dev2udev() udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to an actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few housekeeping bits. This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference. If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I believe this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).


# 52400704 09-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Unconfuse DEV_MODULE() and DEV_DRIVER_MODULE() about the difference between
a major number and a dev_t.


# dd3ebe6b 09-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Don't confuse dev_t and major numbers in DEV_MODULE()


# 4be2eb8c 08-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

I got tired of seeing all the cdevsw[major(foo)] all over the place.

Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.


# 46eede00 07-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Continue where Julian left off in July 1998:

Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline)
function.

Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
to the order of the cmaj/bmaj arguments!)

Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
(ditto!)

(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)


# 155f87da 24-Feb-1999 Matthew Dillon <dillon@FreeBSD.org>

Reviewed by: Julian Elischer <julian@whistle.com>

Add d_parms() to {c,b}devsw[]. If non-NULL this function points to
a device routine that will properly fill in the specinfo structure.
vfs_subr.c's checkalias() supplies appropriate defaults. This change
should be fully backwards compatible with existing devices.


# 334fab0d 21-Jan-1999 Peter Wemm <peter@FreeBSD.org>

Typo: s/mdev/bdev/, added in rev 1.47 (must fix in RELENG_3 too)


# 14177d72 14-Nov-1998 Garrett Wollman <wollman@FreeBSD.org>

My changes to the new device interface:

- Interface with the new resource manager.
- Allow for multiple drivers implementing a single devclass.
- Remove ordering dependencies between header files.
- Style cleanup.
- Add DEVICE_SUSPEND and DEVICE_RESUME methods.
- Move to a single-phase interrupt setup scheme.

Kernel builds on the Alpha are broken until Doug gets a chance to incorporate
these changes on that side.

Agreed to in principle by: dfr


# d44f5a8a 10-Nov-1998 Doug Rabson <dfr@FreeBSD.org>

Allow the use of NODEV in CDEV_MODULE and BDEV_MODULE to make the system
auto-allocate the major number. Not terribly useful without DEVFS.


# 7095ee91 07-Nov-1998 Doug Rabson <dfr@FreeBSD.org>

* Fix a couple of places in the device pager where an address was
truncated to 32 bits.
* Change the calling convention of the device mmap entry point to
pass a vm_offset_t instead of an int for the offset allowing
devices with a larger memory map than (1<<32) to be supported
on the alpha (/dev/mem is one such).

These changes are required to allow the X server to mmap the various
I/O regions used for device port and memory access on the alpha.


# 0375c9f2 05-Sep-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Add a new vnode op, VOP_FREEBLKS(), which filesystems can use to inform
device drivers about sectors no longer in use.

Device-drivers receive the call through d_strategy, if they have
D_CANFREE in d_flags.

This allows flash based devices to erase the sectors and avoid
pointlessly carrying them around in compactions.

Reviewed by: Kirk Mckusick, bde
Sponsored by: M-Systems (www.m-sys.com)


# 5879dcdb 20-Aug-1998 Bruce Evans <bde@FreeBSD.org>

Moved `nx' functions to the one place where they are used (su.c).
They shouldn't be used there either. They should have gone away
about 3 years ago when the statically initialized devswitches went
away, but su.c unfortunately still frobs the cdevswitch in the old
way.


# f7ea2f55 04-Jul-1998 Julian Elischer <julian@FreeBSD.org>

There is no such thing any more as "struct bdevsw".

There is only cdevsw (which should be renamed in a later edit to deventry
or something). cdevsw contains the union of what were in both bdevsw and
cdevsw entries. The bdevsw[] table still exists and is a second pointer
to the cdevsw entry of the device. Its major is in d_bmaj rather than
d_maj. some cleanup still to happen (e.g. dsopen now gets two pointers
to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw).

rawread()/rawwrite() went away as part of this though it's not strictly
the same patch, just that it involves all the same lines in the drivers.

cdroms no longer have write() entries (they did have rawwrite (?)).
tapes no longer have support for bdev operations.

Reviewed by: Eivind Eklund and Mike Smith
Changes suggested by eivind.


# b6259105 25-Jun-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Remove bdevsw_add(), change the only two users to use bdevsw_add_generic().
Extend cdevsw to be superset of bdevsw.
Remove non-functional bdev lkm support.
Teach wcd what the open() args mean.


# 084d9853 17-Jun-1998 Bruce Evans <bde@FreeBSD.org>

Don't declare isa device structs or isa interrupt handlers in <sys/conf>,
and don't depend on them being declared there. This will cause lots of
warnings for a few minutes until config is updated. Interrupt handlers
should never have been configured by config, and the machine generated
declarations get in the way of changing the arg type from int to void *.


# ecbb00a2 07-Jun-1998 Doug Rabson <dfr@FreeBSD.org>

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
inline with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# a4daaa09 12-Feb-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Implement the spirit but not the letter of Terry's hot-char patch.

The differences between Terry's patch and this patch are:
* Remove a lot of un-needed comments.
* Don't put l_hotchar at the front of struct linesw, there is no need to.
* Use the #defines for the hotchar in the SLIP and PPP line disciplines


# 50ce7ff4 23-Jan-1998 John Dyson <dyson@FreeBSD.org>

Add better support for larger I/O clusters, including larger physical
I/O. The support is not mature yet, and some of the underlying implementation
needs help. However, support does exist for IDE devices now.


# cb451ebd 22-Nov-1997 Bruce Evans <bde@FreeBSD.org>

Staticized.


# 81bca6dd 27-Sep-1997 KATO Takenori <kato@FreeBSD.org>

Clustered read and write are switched at mount-option level.

1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW
bits of the mnt_flag. The sysctl variables, vfs.foo.doclusterread
and vfs.foo.doclusterwrite are deleted. Only mount option can
control clustered I/O from userland.
2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR
and D_CLUSTERW bits of the d_flags member in the block device switch
table. If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR /
MNT_NOCLUSTERW bits will be set. In this case, MNT_NOCLUSTERR and
MNT_NOCLUSTERW cannot be cleared from userland.
3. Vnode driver disables both clustered read and write.
4. Union filesystem disables clutered write.

Reviewed by: bde


# 3a74593f 13-Sep-1997 Peter Wemm <peter@FreeBSD.org>

Update interfaces for poll()


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 996c772f 09-Feb-1997 John Dyson <dyson@FreeBSD.org>

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# cba8a5dd 23-Jul-1996 Poul-Henning Kamp <phk@FreeBSD.org>

Make a "DWIM" function for adding [bc]devsw entries for bdev drivers.

Saves about 280 bytes of source per driver, 56 bytes in object size
and another 56 bytes moves from data to bss.

No functional change intended nor expected.

GENERIC should be about one k smaller now :-)


# 02e2c406 11-Mar-1996 Peter Wemm <peter@FreeBSD.org>

Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all
files are off the vendor branch, so this should not change anything.

A "U" marker generally means that the file was not changed in between
the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally
means that there was a change.
[new sys/syscallargs.h file, to be "cvs rm"ed]


# 269f85f0 10-Mar-1996 Jeffrey Hsu <hsu@FreeBSD.org>

Merge in Lite2: stylistic changes to function prototypes
add comments
Did not accept change of second argument of ioctl prototype from int to u_long.
Did not merge in changes to fields in bdevsw and cdevsw.
Reviewed by: davidg & bde


# d1022821 14-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Removed my devsw access functions [un]register_cdev() and
getmajorbyname() which were a better (sigh) temporary interface to
the going-away devswitches.

Note that SYSINIT()s to initialize the devswitches would be fatal
in syscons.c and pcvt_drv.c (and are bogus elsewhere) because they
get called independently of whether the device is attached; thus
devices that share a major clobber each other's devswitch entries
until the last one wins.

conf.c:
Removed stale #includes and comments.


# 6ba9ebce 13-Dec-1995 Julian Elischer <julian@FreeBSD.org>

devsw tables are now arrays of POINTERS to struct [cb]devsw
seems to work here just fine though I can't check every file
that changed due to limited h/w, however I've checked enough to be pretty
happy with the code..

WARNING... struct lkm[mumble] has changed
so it might be an idea to recompile any lkm related programs


# c73feca0 10-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Removed new alias d_size_t for d_psize_t.

Removed old aliases d_rdwr_t and d_ttycv_t for d_read_t/d_write_t and
d_devtotty_t.

Sorted declarations of switch functions into switch order.

Removed duplicated comments and declarations of nonexistent switch
functions.


# 154c04e5 10-Dec-1995 Poul-Henning Kamp <phk@FreeBSD.org>

Last commit this round: Staticize.
we are now down to about 1146 symbols being global, of which
I estimate that about 100 are validly so.


# 87f6c662 08-Dec-1995 Julian Elischer <julian@FreeBSD.org>

Pass 3 of the great devsw changes
most devsw referenced functions are now static, as they are
in the same file as their devsw structure. I've also added DEVFS
support for nearly every device in the system, however
many of the devices have 'incorrect' names under DEVFS
because I couldn't quickly work out the correct naming conventions.
(but devfs won't be coming on line for a month or so anyhow so that doesn't
matter)

If you "OWN" a device which would normally have an entry in /dev
then search for the devfs_add_devsw() entries and munge to make them right..
check out similar devices to see what I might have done in them if you
can't see what's going on..
for a laugh compare conf.c conf.h before and after... :)
I have not done DEVFS entries for any DISKSLICE devices yet as that will be
a much more complicated job.. (pass 5 :)

pass 4 will be to make the devsw tables of type (cdevsw * )
rather than (cdevsw)
seems to work here..
complaints to the usual places.. :)


# 9dcfe99c 05-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Moved prototypes for rawread(), rawwrite() and setconf() to a better place.

Restored order.


# 7198bf47 29-Nov-1995 Julian Elischer <julian@FreeBSD.org>

If you're going to mechanically replicate something in 50 files
it's best to not have a (compiles cleanly) typo in it! (sigh)


# 8b25681e 05-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Replaced bogus macros for dummy devswitch entries by functions.
These functions went away:

enosys (hasn't been used for some time)
enxio
enodev
enoioctl (was used only once, actually for a vop)

if_tun.c:
Continued cleaning up...

conf.h:
Probably fixed the type of d_reset_t. It is hard to tell the correct
type because there are no non-dummy device reset functions.

Removed last vestige of ambiguous sleep message strings.
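
The idea of replacing macro-based dummy devswitch entries with real functions can be sketched as follows. This assumes the old scheme reused one generic error function through casts; the typedef signatures are simplified and the names nodev_open(), nodev_ioctl() and cdev_open() are illustrative, not taken from the tree.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified switch-function types (the kernel's take more arguments). */
typedef int d_open_t(int dev, int flags);
typedef int d_ioctl_t(int dev, unsigned long cmd, void *data);

struct cdevsw {
	d_open_t	*d_open;
	d_ioctl_t	*d_ioctl;
};

/* One correctly typed dummy per slot, so no function-pointer casts are needed. */
int
nodev_open(int dev, int flags)
{
	(void)dev;
	(void)flags;
	return (ENODEV);
}

int
nodev_ioctl(int dev, unsigned long cmd, void *data)
{
	(void)dev;
	(void)cmd;
	(void)data;
	return (ENODEV);
}

/* An unconfigured device: every entry is a dummy of the right type. */
struct cdevsw nodev_cdevsw = { nodev_open, nodev_ioctl };

/* Callers dispatch through the switch without special-casing dummies. */
int
cdev_open(const struct cdevsw *sw, int dev, int flags)
{
	assert(sw != NULL && sw->d_open != NULL);
	return (sw->d_open(dev, flags));
}
```

Because each dummy has exactly the type of the slot it fills, the compiler can check the table initializer, which a cast would have silenced.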


# 286f8561 05-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Replaced bogus macros for entry points to unconfigured line disciplines
by functions.

tty_conf.c:
Cleaned up formatting of tables.

Removed another ARGSUSED for consistency.

conf.h:
Introduced typedefs for line discipline functions.

Backed out most of previous revision (it is done elsewhere).


# c6b45cc0 05-Nov-1995 Peter Wemm <peter@FreeBSD.org>

Workaround for conflicting kernel prototypes in user mode breaking make world

I've moved the #include <machine/conf.h> inside the #ifdef KERNEL
because things like modload #include <sys/conf.h> for the cdevsw definitions
but don't need the conflicting prototype for fdopen in stdio.h and
machine/conf.h.

Bruce may have a better fix for this, but for now I need a make world..


# 4fda91c7 04-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Moved prototypes for devswitch functions from conf.c and driver sources
to <machine/conf.h>. conf.h was mechanically generated by
`grep ^d_ conf.c >conf.h'. This accounts for part of its ugliness. The
prototypes should be moved back to the driver sources when the functions
are staticalized.


# 2f5585d1 03-Oct-1995 Julian Elischer <julian@FreeBSD.org>

add the file kern_conf.c so it can be compiled in when needed
for testing.. (for cdevsw_add and bdevsw_add)

make the lkm code use the new generic devsw add routines (if so required)
define these routines in conf.h so we can use them


# e7451974 10-Sep-1995 Bruce Evans <bde@FreeBSD.org>

Make pcvt and syscons live in the same kernel. If both are enabled, then
the first one in the config has priority. They can be switched using
userconfig().

i386/i386/conf.c:
Initialize the shared syscons/pcvt cdevsw entry to `nx'.

Add cdevsw registration functions.

Use devsw functions of the correct type if they exist.

i386/i386/cons.c:
Add renamed syscons entry points to constab.

i386/i386/cons.h:
Declare the renamed syscons entry points.

i386/i386/machdep.c:
Repeat console initialization after userconfig() in case the current
console has become wrong. This depends on cn functions not wiring down
anything important.

sys/conf.h:
Declare new functions.

i386/isa/isa.[ch]:
Add a function to decide which display driver has priority. Should be
done better.

i386/isa/syscons.c:
Rename pccn* -> sccn*.

Initialize CRTC start address in case the previous driver has moved it.

i386/isa/syscons.c, i386/isa/pcvt/*
Initialize the bogusly shared variable Crtat dynamically in case the
stored value was changed by the previous driver.

Initialize cdevsw table from a template.

Don't grab the console if another display driver has priority.

i386/isa/syscons.h, i386/isa/pcvt/pcvt_hdr.h:
Don't externally declare now-static cdevsw functions.

i386/isa/pcvt/pcvt_hdr.h:
Set the sensitive hardware flag so that pcvt doesn't always have lower
priority than syscons. This also fixes the "stupid" detection of the
display after filling the display with text.

i386/isa/pcvt/pcvt_out.c:
Don't be confused by the off-screen cursor offset 0xffff set by syscons.

kern/subr_xxx.c:
Add enough nxio/nodev/null devsw functions of the correct type for syscons
and pcvt.


# 38f93a35 08-Sep-1995 Bruce Evans <bde@FreeBSD.org>

d_stop functions always returned void. Fix the declaration of d_stop_t
to match.


# 9b2e5354 30-May-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Remove trailing whitespace.


# a401ebbe 13-May-1995 David Greenman <dg@FreeBSD.org>

Changed swap partition handling/allocation so that it doesn't
require specific partitions be mentioned in the kernel config
file ("swap on foo" is now obsolete).

From Poul-Henning:

The visible effect is this:

As default, unless
options "NSWAPDEV=23"
is in your config, you will have four swap-devices.
You can swapon(2) any block device you feel like, it doesn't have
to be in the kernel config.

There is a performance/resource win available by getting the NSWAPDEV right
(but only if you have just one swap-device ??), but using that as default
would be too restrictive.

The invisible effect is that:

Swap-handling disappears from the $arch part of the kernel.
It gets a lot simpler (-145 lines) and cleaner.

Reviewed by: John Dyson, David Greenman
Submitted by: Poul-Henning Kamp, with minor changes by me.


# 5bb4f738 12-May-1995 Garrett Wollman <wollman@FreeBSD.org>

The death of `options NODUMP'. Now the dump area can be dynamically
configured (and unconfigured) on the fly. A sysctl(3) MIB variable is
provided to inspect and modify the dump device setting.


# f81904fe 23-Apr-1995 Bruce Evans <bde@FreeBSD.org>

Declare d_dump_t and d_mmap_t completely. Nothing depends on the
incomplete declarations here any more. Some things depend on
incomplete declarations elsewhere. The `offset' arg to d_mmap_t is
bogus (it is `int' but should be `vm_offset_t') but it is what the
driver mmap functions actually accept, although they are passed a
`vm_offset_t'.

Function declarations in headers should always be complete to avoid
warnings from `gcc -Wstrict-prototypes' for compiling modules that
don't even use the offending declarations.


# 3749fcff 21-Mar-1995 Peter Dufault <dufault@FreeBSD.org>

Set it so you can add and remove line disciplines without replicating
code for looking for open slots in table (and you could hide the table
if you wanted to).
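
A registration interface of this kind might look like the sketch below. It is a simplified illustration under assumed names: ldisc_add(), ldisc_remove(), LDISC_ANY, MAXLDISC and the reduced struct linesw are hypothetical and do not reproduce the kernel's actual interface.

```c
#include <assert.h>
#include <stddef.h>

#define MAXLDISC	8	/* illustrative table size */
#define LDISC_ANY	(-1)	/* "pick any free slot" request */

/* Reduced stand-in for the line discipline switch. */
struct linesw {
	int	(*l_open)(int dev);
};

/* The table is private to this file; callers only see add/remove. */
static struct linesw *ldisc_tab[MAXLDISC];

/* Sample disciplines used only to exercise the interface. */
struct linesw demo_ldisc_a, demo_ldisc_b;

/* Register `lp` in `slot`, or in the first free slot for LDISC_ANY.
 * Returns the slot used, or -1 on failure. */
int
ldisc_add(int slot, struct linesw *lp)
{
	int i;

	assert(lp != NULL);
	if (slot == LDISC_ANY) {
		for (i = 0; i < MAXLDISC; i++) {
			if (ldisc_tab[i] == NULL) {
				slot = i;
				break;
			}
		}
		if (slot == LDISC_ANY)
			return (-1);	/* table full */
	} else if (slot < 0 || slot >= MAXLDISC || ldisc_tab[slot] != NULL) {
		return (-1);		/* out of range or already taken */
	}
	ldisc_tab[slot] = lp;
	return (slot);
}

/* Free a slot so a later ldisc_add() can reuse it. */
void
ldisc_remove(int slot)
{
	if (slot >= 0 && slot < MAXLDISC)
		ldisc_tab[slot] = NULL;
}
```

Keeping the free-slot search in one place means each loadable discipline calls a single function instead of scanning the table itself, and the table can stay static.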


# b5e8ce9f 16-Mar-1995 Bruce Evans <bde@FreeBSD.org>

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 77f77631 25-Feb-1995 Paul Traina <pst@FreeBSD.org>

(a) remove the pointer to each driver's tty structure array from cdevsw
(b) add a function callback vector to tty drivers that will return a pointer
to a valid tty structure based upon a dev_t
(c) make syscons structures the same size whether or not APM is enabled so
utilities don't crash if NAPM changes (and make the damn kernel compile!)
(d) rewrite /dev/snp ioctl interface so that it is device driver and i386
independent
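
Items (a) and (b) can be illustrated with a small sketch: instead of exposing a pointer to its tty array in cdevsw, a driver supplies a callback that maps a device number to a tty. The types here are stand-ins (demo_dev_t for dev_t, a one-field struct tty), and demo_devtotty() and cdev_devtotty() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int demo_dev_t;	/* stand-in for the kernel's dev_t */

struct tty {
	int	t_line;			/* reduced to a single field */
};

/* Callback type: return the tty for `dev`, or NULL if there is none. */
typedef struct tty *d_devtotty_t(demo_dev_t dev);

struct cdevsw {
	d_devtotty_t	*d_devtotty;
};

#define NDEMO	2			/* units in the demo driver */

/* The array stays private to the driver; only the callback is exported. */
static struct tty demo_tty[NDEMO];

struct tty *
demo_devtotty(demo_dev_t dev)
{
	if (dev >= NDEMO)
		return (NULL);		/* no such unit */
	return (&demo_tty[dev]);
}

struct cdevsw demo_cdevsw = { demo_devtotty };

/* Generic code asks the driver instead of indexing its array directly. */
struct tty *
cdev_devtotty(const struct cdevsw *sw, demo_dev_t dev)
{
	assert(sw != NULL);
	if (sw->d_devtotty == NULL)
		return (NULL);		/* not a tty driver */
	return (sw->d_devtotty(dev));
}
```

The callback lets each driver validate the unit number and choose its own storage layout, which a bare array pointer in the switch could not express.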


# 62f8f85b 09-Feb-1995 David Greenman <dg@FreeBSD.org>

Clean up after Jordan's commit: add d_read_t and d_write_t types for
compatibility with the screwy conf.c macros.


# 9e2429f5 22-Jan-1995 Poul-Henning Kamp <phk@FreeBSD.org>

Moved the typedefs of d_<foo>_t into sys/sys/conf.h and used them in
the definition of struct [cb]devsw. Guess Bruce never got around to
completing this (?)

Poul-Henning


# 51cd3c96 11-Dec-1994 Bruce Evans <bde@FreeBSD.org>

Declare d_strategy_t here and use it to declare strategy functions.
Sort prototypes.
Uniformize idempotency ifdef.
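
As a rough sketch of the pattern described here: a function typedef such as d_strategy_t can serve both as the prototype for a driver's strategy routine and as the type of a switch entry, with the whole header wrapped in an idempotency guard. The guard name DEMO_CONF_H, the one-field struct buf, and demo_strategy()/run_strategy() are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Idempotency guard: a second inclusion of this "header" is a no-op. */
#ifndef DEMO_CONF_H
#define DEMO_CONF_H

struct buf {
	int	b_done;			/* reduced stand-in for the real struct buf */
};

/* Function type shared by all strategy routines. */
typedef void d_strategy_t(struct buf *bp);

/* The typedef doubles as a complete prototype for the driver's routine. */
d_strategy_t demo_strategy;

#endif /* !DEMO_CONF_H */

/* The definition must match the typedef, or the compiler complains. */
void
demo_strategy(struct buf *bp)
{
	bp->b_done = 1;			/* pretend the I/O completed */
}

/* The same typedef, as a pointer, types the switch entry or callback. */
int
run_strategy(d_strategy_t *strategy)
{
	struct buf b = { 0 };

	assert(strategy != NULL);
	strategy(&b);
	return (b.b_done);
}
```

Declaring routines through the typedef keeps every driver's prototype in step with the switch definition: changing the typedef in one place flags all mismatched definitions at compile time.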


# 7520c44d 04-Dec-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

chrtoblk() returns dev_t, not int.


# b4a8d575 08-Oct-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Added prototypes here and there. Moved pfctlinput into socket.h.


# af9da405 20-Aug-1994 Paul Richards <paul@FreeBSD.org>

Made them all idempotent.


# 3c4dd356 02-Aug-1994 David Greenman <dg@FreeBSD.org>

Added $Id$


# df8bae1d 24-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

BSD 4.4 Lite Kernel Sources