History log of /freebsd-10.1-release/sys/sys/mount.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 270892 31-Aug-2014 trasz

MFC r270096:

Bring in the new automounter, similar to what's provided in most other
UNIX systems, eg. MacOS X and Solaris. It uses Sun-compatible map format,
has proper kernel support, and LDAP integration.

There are still a few outstanding problems; they will be fixed shortly.

Reviewed by: allanjude@, emaste@, kib@, wblock@ (earlier versions)
Phabric: D523
Relnotes: yes
Sponsored by: The FreeBSD Foundation


# 270095 17-Aug-2014 kib

MFC r269457:
Remove Giant acquisition from the mount and unmount pathes.


# 259295 13-Dec-2013 kib

MFC r257904:
Hide MNT_SHARED_WRITES() and MNT_EXTENDED_SHARED() under the #ifdef
_KERNEL braces. Struct mount is only defined for the kernel build.


# 259294 13-Dec-2013 kib

MFC r257898:
Change VFS_PROLOGUE() to evaluate the mp once, convert
MNTK_SHARED_WRITES and MNTK_EXTENDED_SHARED tests into inline functions.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 255136 01-Sep-2013 rmacklem

Forced dismounts of NFS mounts can fail when thread(s) are stuck
waiting for an RPC reply from the server while holding the mount
point busy (mnt_lockref incremented). This happens because dounmount()
msleep()s waiting for mnt_lockref to become 0, before calling
VFS_UNMOUNT(). This patch adds a new VFS operation called VFS_PURGE(),
which the NFS client implements as purging RPCs in progress. Making
this call before checking mnt_lockref fixes the problem, by ensuring
that the VOP_xxx() calls will fail and unbusy the mount point.

Reported by: sbruno
Reviewed by: kib
MFC after: 2 weeks


# 254756 23-Aug-2013 alfred

Grow some spares in struct vfsops.

This should hopefully prevent ABI breakage
on adding new vfsops in 10.x.


# 251604 10-Jun-2013 marcel

Revert r251590. It unexpectedly broke the build and there were some
questions on locking. As part of commit-bit grooming, I'd like Steve
to handle this, but can't leave things broken in the mean time.


# 251590 09-Jun-2013 marcel

Add vfs_mounted and vfs_unmounted events so that components can be informed
about mount and unmount events. This is used by Juniper to implement a more
optimal implementation of NetBSD's veriexec.

Submitted by: stevek@juniper.net
Obtained from: Juniper Networks, Inc


# 250505 11-May-2013 kib

- Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The
null_hashget() obtains the reference on the nullfs vnode, which must
be dropped.

- Fix a wart which existed from the introduction of the nullfs
caching, do not unlock lower vnode in the nullfs_reclaim_lowervp().
It should be innocent, but now it is also formally safe. Inform the
nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on
nullfs inode.

- Add a callback to the upper filesystems for the lower vnode
unlinking. When inactivating a nullfs vnode, check if the lower
vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC
on the lower vnode, and reclaim upper vnode if so. This allows
nullfs to purge cached vnodes for the unlinked lower vnode, avoiding
excessive caching.

Reported by: G??ran L??wkrantz <goran.lowkrantz@ismobile.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 248511 19-Mar-2013 kib

A flag for the filesystem to indicate to the upper levels that it accepts
unmapped buffers for the VOP_STRATEGY().

Sponsored by: The FreeBSD Foundation
Tested by: pho


# 247116 21-Feb-2013 jhb

Further refine the handling of stop signals in the NFS client. The
changes in r246417 were incomplete as they did not add explicit calls to
sigdeferstop() around all the places that previously passed SBDRY to
_sleep(). In addition, nfs_getcacheblk() could trigger a write RPC from
getblk() resulting in sigdeferstop() recursing. Rather than manually
deferring stop signals in specific places, change the VFS_*() and VOP_*()
methods to defer stop signals for filesystems which request this behavior
via a new VFCF_SBDRY flag. Note that this has to be a VFC flag rather than
a MNTK flag so that it works properly with VFS_MOUNT() when the mount is
not yet fully constructed. For now, only the NFS clients are set this new
flag in VFS_SET().

A few other related changes:
- Add an assertion to ensure that TDF_SBDRY doesn't leak to userland.
- When a lookup request uses VOP_READLINK() to follow a symlink, mark
the request as being on behalf of the thread performing the lookup
(cnp_thread) rather than using a NULL thread pointer. This causes
NFS to properly handle signals during this VOP on an interruptible
mount.

PR: kern/176179
Reported by: Russell Cattelan (sigdeferstop() recursion)
Reviewed by: kib
MFC after: 1 month


# 245894 24-Jan-2013 pluknet

Update and clarify comments regarding VFS op table initialization
in the man page and its header counterpart.

Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (initial version)
Reviewed and further improved by: bde (previous version)
All bugs are: mine


# 245001 03-Jan-2013 kib

Remove the deprecated MNT_VNODE_FOREACH interface. Use the
MNT_VNODE_FOREACH_ALL instead.


# 244240 15-Dec-2012 kib

When mnt_vnode_next_active iterator cannot lock the next vnode and
yields, specify the user priority for the yield. Otherwise, a
higher-priority (kernel) thread could fall into the priority-inversion
with the thread owning the mutex lock.

On single-processor machines or UP kernels, do not loop adaptively
when the next vnode cannot be locked, instead yield unconditionally.

Restructure the iteration initializer and the iterator to remove code
duplication. Put the code to fetch and lock a vnode next to the
current marker, into the mnt_vnode_next_active() function, and use it
instead of repeating the loop.

Reported by: hrs, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 days


# 244238 15-Dec-2012 kib

Line up the continuation backslashes.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days


# 242833 09-Nov-2012 attilio

Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag.
Porters should refer to __FreeBSD_version 1000021 for this change as
it may have happened at the same timeframe.


# 241896 22-Oct-2012 kib

Remove the support for using non-mpsafe filesystem modules.

In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by: attilio
Tested by: pho


# 241225 05-Oct-2012 avg

mount.h: MNTK_VGONE_UPPER and MNTK_VGONE_WAITER were supposed to be different

... otherwise a waiter is never woken up.

Reported by: swills
Discussed with: jhb
Approved by: kib
MFC after: 3 days


# 240284 09-Sep-2012 kib

Add a facility for vgone() to inform the set of subscribed mounts
about vnode reclamation. Typical use is for the bypass mounts like
nullfs to get a notification about lower vnode going away.

Now, vgone() calls new VFS op vfs_reclaim_lowervp() with an argument
lowervp which is reclaimed. It is possible to register several
reclamation event listeners, to correctly handle the case of several
nullfs mounts over the same directory.

For the filesystem not having nullfs mounts over it, the overhead
added is a single mount interlock lock/unlock in the vnode reclamation
path.

In collaboration with: pho
MFC after: 3 weeks


# 240283 09-Sep-2012 kib

Add MNTK_LOOKUP_EXCL_DOTDOT struct mount flag, which specifies to the
lookup code that dotdot lookups shall override any shared lock
requests with the exclusive one. The flag is useful for filesystems
which sometimes need to upgrade shared lock to exclusive inside the
VOP_LOOKUP or later, which cannot be done safely for dotdot, due to
dvp also locked and causing LOR.

In collaboration with: pho
MFC after: 3 weeks


# 236321 30-May-2012 kib

vn_io_fault() is a facility to prevent page faults while filesystems
perform copyin/copyout of the file data into the usermode
buffer. Typical filesystem hold vnode lock and some buffer locks over
the VOP_READ() and VOP_WRITE() operations, and since page fault
handler may need to recurse into VFS to get the page content, a
deadlock is possible.

The facility works by disabling page faults handling for the current
thread and attempting to execute i/o while allowing uiomove() to
access the usermode mapping of the i/o buffer. If all buffer pages are
resident, uiomove() is successfull and request is finished. If EFAULT
is returned from uiomove(), the pages backing i/o buffer are faulted
in and held, and the copyin/out is performed using uiomove_fromphys()
over the held pages for the second attempt of VOP call.

Since pages are hold in chunks to prevent large i/o requests from
starving free pages pool, and since vnode lock is only taken for
i/o over the current chunk, the vnode lock no longer protect atomicity
of the whole i/o request. Use newly added rangelocks to provide the
required atomicity of i/o regardind other i/o and truncations.

Filesystems need to explicitely opt-in into the scheme, by setting the
MNTK_NO_IOPF struct mount flag, and optionally by using
vn_io_fault_uiomove(9) helper which takes care of calling uiomove() or
converting uio into request for uiomove_fromphys().

Reviewed by: bf (comments), mdf, pjd (previous version)
Tested by: pho
Tested by: flo, Gustau P?rez <gperez entel upc edu> (previous version)
MFC after: 2 months


# 235619 18-May-2012 mckusick

Update comment to document that the vnode free-list mutex needs to be
held when updating mnt_activevnodelist and mnt_activevnodelistsize.


# 234482 20-Apr-2012 mckusick

This change creates a new list of active vnodes associated with
a mount point. Active vnodes are those with a non-zero use or hold
count, e.g., those vnodes that are not on the free list. Note that
this list is in addition to the list of all the vnodes associated
with a mount point.

To avoid adding another set of linkage pointers to the vnode
structure, the active list uses the existing linkage pointers
used by the free list (previously named v_freelist, now renamed
v_actfreelist).

This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops
over just the active vnodes associated with a mount point (typically
less than 1% of the vnodes associated with the mount point).

Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks


# 234386 17-Apr-2012 mckusick

Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL.
The primary changes are that the user of the interface no longer
needs to manage the mount-mutex locking and that the vnode that
is returned has its mutex locked (thus avoiding the need to check
to see if its is DOOMED or other possible end of life senarios).

To minimize compatibility issues for third-party developers, the
old MNT_VNODE_FOREACH interface will remain available so that this
change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH
will be removed in head.

The reason for this update is to prepare for the addition of the
MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the
active vnodes associated with a mount point (typically less than
1% of the vnodes associated with the mount point).

Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks


# 234157 11-Apr-2012 mckusick

Whitespace cleanup.


# 233999 07-Apr-2012 gleb

Add vfs_getopt_size. Support human readable file system options in tmpfs.

Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.

Discussed with: delphij
MFC after: 2 weeks


# 232709 08-Mar-2012 kib

Decomission mnt_noasync. Introduce MNTK_NOASYNC mnt_kern_flag which
allows a filesystem to request VFS to not allow MNTK_ASYNC.

MFC after: 1 week


# 230249 16-Jan-2012 mckusick

Make sure all intermediate variables holding mount flags (mnt_flag)
and that all internal kernel calls passing mount flags are declared
as uint64_t so that flags in the top 32-bits are not lost.

MFC after: 2 weeks


# 224294 24-Jul-2011 mckusick

Move the MNTK_SUJ flag in mnt_kern_flag to MNT_SUJ in mnt_flag
so that it is visible to userland programs. This change enables
the `mount' command with no arguments to be able to show if a
filesystem is mounted using journaled soft updates as opposed
to just normal soft updates.

Approved by: re (bz)


# 224290 24-Jul-2011 mckusick

This update changes the mnt_flag field in the mount structure from
32 bits to 64 bits and eliminates the unused mnt_xflag field. The
existing mnt_flag field is completely out of bits, so this update
gives us room to expand. Note that the f_flags field in the statfs
structure is already 64 bits, so the expanded mnt_flag field can
be exported without having to make any changes in the statfs structure.

Approved by: re (bz)


# 222167 21-May-2011 rmacklem

Add a lock flags argument to the VFS_FHTOVP() file system
method, so that callers can indicate the minimum vnode
locking requirement. This will allow some file systems to choose
to return a LK_SHARED locked vnode when LK_SHARED is specified
for the flags argument. This patch only adds the flag. It
does not change any file system to use it and all callers
specify LK_EXCLUSIVE, so file system semantics are not changed.

Reviewed by: kib


# 216626 21-Dec-2010 pjd

Close body of the VFS_UNLOCK_GIANT() macro into do { } while (0) loop,
so it can be used in code like this:

if (cond)
VFS_UNLOCK_GIANT(vfslocked);
else
; /* Do something else. */

Before the change, compiler couldn't decide on its own if else should be
applied to the 'if (cond)' or to the if statement inside VFS_UNLOCK_GIANT()
macro.


# 213664 10-Oct-2010 kib

The r184588 changed the layout of struct export_args, causing an ABI
breakage for old mount(2) syscall, since most struct <filesystem>_args
embed export_args. The mount(2) is supposed to provide ABI
compatibility for pre-nmount mount(8) binaries, so restore ABI to
pre-r184588.

Requested and reviewed by: bde
MFC after: 2 weeks


# 212618 14-Sep-2010 kib

Rename the field to not confuse readers. The bytes are actually used.

Discussed with: rmacklem
MFC after: 1 week


# 212466 11-Sep-2010 kib

Protect mnt_syncer with the sync_mtx. This prevents a (rare) vnode leak
when mount and update are executed in parallel.

Encapsulate syncer vnode deallocation into the helper function
vfs_deallocate_syncvnode(), to not externalize sync_mtx from vfs_subr.c.

Found and reviewed by: jh (previous version of the patch)
Tested by: pho
MFC after: 3 weeks


# 211930 28-Aug-2010 pjd

There is a bug in vfs_allocate_syncvnode() failure handling in mount code.
Actually it is hard to properly handle such a failure, especially in MNT_UPDATE
case. The only reason for the vfs_allocate_syncvnode() function to fail is
getnewvnode() failure. Fortunately it is impossible for current implementation
of getnewvnode() to fail, so we can assert this and make
vfs_allocate_syncvnode() void. This in turn free us from handling its failures
in the mount code.

Reviewed by: kib
MFC after: 1 month


# 207141 24-Apr-2010 jeff

- Merge soft-updates journaling from projects/suj/head into head. This
brings in support for an optional intent log which eliminates the need
for background fsck on unclean shutdown.

Sponsored by: iXsystems, Yahoo!, and Juniper.
With help from: McKusick and Peter Holm


# 200796 21-Dec-2009 trasz

Implement NFSv4 ACL support for UFS.

Reviewed by: rwatson


# 195148 28-Jun-2009 stas

- Turn the third (islocked) argument of the knote call into flags parameter.
Introduce the new flag KNF_NOKQLOCK to allow event callers to be called
without KQ_LOCK mtx held.
- Modify VFS knote calls to always use KNF_NOKQLOCK flag. This is required
for ZFS as its getattr implementation may sleep.

Approved by: re (rwatson)
Reviewed by: kib
MFC after: 2 weeks


# 193762 08-Jun-2009 ps

Simply shared vnode locking and extend it to also include fsync.
Also, in vop_write, no longer assert for exclusive locks on the
vnode.

Reviewed by: jhb, kmacy, jeffr


# 193440 04-Jun-2009 ps

Support shared vnode locks for write operations when the offset is
provided on filesystems that support it. This really improves mysql
+ innodb performance on ZFS.

Reviewed by: jhb, kmacy, jeffr


# 193138 30-May-2009 attilio

Remove the now invalid (and possibly unused) debug.mpsafevfs
sysctl/tunable.

Reviewed by: emaste
Sponsored by: Sandvine Incorporated


# 193041 29-May-2009 trasz

There is only one spare MNT_ flag left, and I want to use it for NFSv4 ACLs.
Make room for additional filesystem flags now, to avoid breaking ABI later.

Reviewed by: kib@


# 191990 11-May-2009 attilio

Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.


# 189696 11-Mar-2009 jhb

Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a
filesystem supports additional operations using shared vnode locks.
Currently this is used to enable shared locks for open() and close() of
read-only file descriptors.
- When an ISOPEN namei() request is performed with LOCKSHARED, use a
shared vnode lock for the leaf vnode only if the mount point has the
extended shared flag set.
- Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but
not O_CREAT.
- Use a shared vnode lock around VOP_CLOSE() if the file was opened with
O_RDONLY and the mountpoint has the extended shared flag set.
- Adjust md(4) to upgrade the vnode lock on the vnode it gets back from
vn_open() since it now may only have a shared vnode lock.
- Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since
FIFO's require exclusive vnode locks for their open() and close()
routines. (My recent MPSAFE patches for UDF and cd9660 already included
this change.)
- Enable extended shared operations on UFS, cd9660, and UDF.

Submitted by: ups
Reviewed by: pjd (ZFS bits)
MFC after: 1 month


# 189290 02-Mar-2009 jamie

Extend the "vfsopt" mount options for more general use. Make struct
vfsopt and the vfs_buildopts function public, and add some new fields
to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and
vfs_opterror.

Further extend the interface to allow reading options from the kernel
in addition to sending them to the kernel, with vfs_setopt and related
functions.

While this allows the "name=value" option interface to be used for more
than just FS mounts (planned use is for jails), it retains the current
"vfsopt" name and <sys/mount.h> requirement.

Approved by: bz (mentor)


# 188243 06-Feb-2009 trasz

Add KASSERTs to make it easier to debug problems like the one fixed
in r188141.

Reviewed by: kib,attilio
Approved by: rwatson (mentor)
Tested by: pho
Sponsored by: FreeBSD Foundation


# 186197 16-Dec-2008 attilio

1) Fix a deadlock in the VFS:
- threadA runs vfs_rel(mp1)
- threadB does unmount the mp1 fs, sets MNTK_UNMOUNT and drop MNT_ILOCK()
- threadA runs vfs_busy(mp1) and, as long as, MNTK_UNMOUNT is set, sleeps
waiting for threadB to complete the unmount
- threadB, in vfs_mount_destroy(), finds mnt_lock > 0 and sleeps waiting
for the refcount to expire.

Fix the deadlock by adding a flag called MNTK_REFEXPIRE which signals the
unmounter is waiting for mnt_ref to expire.
The vfs_busy contenders got awake, fails, and if they retry the
MNTK_REFEXPIRE won't allow them to sleep again.

2) Simplify significantly the code of vfs_mount_destroy() trimming
unnecessary codes:
- as long as any reference exited, it is no-more possible to have
write-op (primarty and secondary) in progress.
- it is no needed to drop and reacquire the mount lock.
- filling the structures with dummy values is unuseful as long as
it is going to be freed.

Tested by: pho, Andrea Barberio <insomniac at slackware dot it>
Discussed with: kib


# 185432 29-Nov-2008 kib

In the nfsrv_fhtovp(), after the vfs_getvfs() function found the pointer
to the fs, but before a vnode on the fs is locked, unmount may free fs
structures, causing access to destroyed data and freed memory.

Introduce a vfs_busymp() function that looks up and busies found
fs while mountlist_mtx is held. Use it in nfsrv_fhtovp() and in the
implementation of the handle syscalls.

Two other uses of the vfs_getvfs() in the vfs_subr.c, namely in
sysctl_vfs_ctl and vfs_getnewfsid seems to be ok. In particular,
sysctl_vfs_ctl is protected by Giant by being a non-sleeping sysctl
handler, that prevents Giant-locked unmount code to interfere with it.

Noted by: tegge
Reviewed by: dfr
Tested by: pho
MFC after: 1 month


# 185029 17-Nov-2008 pjd

Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.

This bring huge amount of changes, I'll enumerate only user-visible changes:

- Delegated Administration

Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.

- L2ARC

Level 2 cache for ZFS - allows to use additional disks for cache.
Huge performance improvements mostly for random read of mostly
static content.

- slog

Allow to use additional disks for ZFS Intent Log to speed up
operations like fsync(2).

- vfs.zfs.super_owner

Allows regular users to perform privileged operations on files stored
on ZFS file systems owned by him. Very careful with this one.

- chflags(2)

Not all the flags are supported. This still needs work.

- ZFSBoot

Support to boot off of ZFS pool. Not finished, AFAIK.

Submitted by: dfr

- Snapshot properties

- New failure modes

Before if write requested failed, system paniced. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests

- Refquota, refreservation properties

Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.

- Sparse volumes

ZVOLs that don't reserve space in the pool.

- External attributes

Compatible with extattr(2).

- NFSv4-ACLs

Not sure about the status, might not be complete yet.

Submitted by: trasz

- Creation-time properties

- Regression tests for zpool(8) command.

Obtained from: OpenSolaris


# 184599 03-Nov-2008 attilio

Remove the mnt_holdcnt and mnt_holdcntwaiters because they are useless.
Really, the concept of holdcnt in the struct mount is rappresented by
the mnt_ref (which prevents the type-stable structure from being
"recycled) handled through vfs_ref() and vfs_rel().
On this optic, switch the holdcnt acquisition into an emulated vfs_ref()
(and subsequent release into vfs_rel()).

Discussed with: kib
Tested by: pho


# 184588 03-Nov-2008 dfr

Implement support for RPCSEC_GSS authentication to both the NFS client
and server. This replaces the RPC implementation of the NFS client and
server with the newer RPC implementation originally developed
(actually ported from the userland sunrpc code) to support the NFS
Lock Manager. I have tested this code extensively and I believe it is
stable and that performance is at least equal to the legacy RPC
implementation.

The NFS code currently contains support for both the new RPC
implementation and the older legacy implementation inherited from the
original NFS codebase. The default is to use the new implementation -
add the NFS_LEGACYRPC option to fall back to the old code. When I
merge this support back to RELENG_7, I will probably change this so
that users have to 'opt in' to get the new code.

To use RPCSEC_GSS on either client or server, you must build a kernel
which includes the KGSSAPI option and the crypto device. On the
userland side, you must build at least a new libc, mountd, mount_nfs
and gssd. You must install new versions of /etc/rc.d/gssd and
/etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf.

As long as gssd is running, you should be able to mount an NFS
filesystem from a server that requires RPCSEC_GSS authentication. The
mount itself can happen without any kerberos credentials but all
access to the filesystem will be denied unless the accessing user has
a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There
is currently no support for situations where the ticket file is in a
different place, such as when the user logged in via SSH and has
delegated credentials from that login. This restriction is also
present in Solaris and Linux. In theory, we could improve this in
future, possibly using Brooks Davis' implementation of variant
symlinks.

Supporting RPCSEC_GSS on a server is nearly as simple. You must create
service creds for the server in the form 'nfs/<fqdn>@<REALM>' and
install them in /etc/krb5.keytab. The standard heimdal utility ktutil
makes this fairly easy. After the service creds have been created, you
can add a '-sec=krb5' option to /etc/exports and restart both mountd
and nfsd.

The only other difference an administrator should notice is that nfsd
doesn't fork to create service threads any more. In normal operation,
there will be two nfsd processes, one in userland waiting for TCP
connections and one in the kernel handling requests. The latter
process will create as many kthreads as required - these should be
visible via 'top -H'. The code has some support for varying the number
of service threads according to load but initially at least, nfsd uses
a fixed number of threads according to the value supplied to its '-n'
option.

Sponsored by: Isilon Systems
MFC after: 1 month


# 184554 02-Nov-2008 attilio

Improve VFS locking:
- Implement real draining for vfs consumers by not relying on the
mnt_lock and using instead a refcount in order to keep track of lock
requesters.
- Due to the change above, remove the mnt_lock lockmgr because it is now
useless.
- Due to the change above, vfs_busy() is no more linked to a lockmgr.
Change so its KPI by removing the interlock argument and defining 2 new
flags for it: MBF_NOWAIT which basically replaces the LK_NOWAIT of the
old version (which was unlinked from the lockmgr alredy) and
MBF_MNTLSTLOCK which provides the ability to drop the mountlist_mtx
once the mnt interlock is held (ability still desired by most consumers).
- The stub used into vfs_mount_destroy(), that allows to override the
mnt_ref if running for more than 3 seconds, make it totally useless.
Remove it as it was thought to work into older versions.
If a problem of "refcount held never going away" should appear, we will
need to fix properly instead than trust on such hackish solution.
- Fix a bug where returning (with an error) from dounmount() was still
leaving the MNTK_MWAIT flag on even if it the waiters were actually
woken up. Just a place in vfs_mount_destroy() is left because it is
going to recycle the structure in any case, so it doesn't matter.
- Remove the markercnt refcount as it is useless.

This patch modifies VFS ABI and breaks KPI for vfs_busy() so manpages and
__FreeBSD_version will be modified accordingly.

Discussed with: kib
Tested by: pho


# 183188 19-Sep-2008 obrien

Add freebsd32 compat shim for nmount(2).
(and quiet some compiler warnings for vfs_donmount)


# 183073 16-Sep-2008 kib

When attempt is made to suspend a filesystem that is already syspended,
wait until the current suspension is lifted instead of silently returning
success immediately. The consequences of calling vfs_write() resume when
not owning the suspension are not well-defined at best.

Add the vfs_susp_clean() mount method to be called from
vfs_write_resume(). Set it to process_deferred_inactive() for ffs, and
stop calling it manually.

Add the thread flag TDP_IGNSUSP that allows to bypass the suspension
point in the vn_start_write. It is intended for use by VFS in the
situations where the suspender want to do some i/o requiring calls to
vn_start_write(), and this i/o cannot be done later.

Reviewed by: tegge
In collaboration with: pho
MFC after: 1 month


# 182542 31-Aug-2008 attilio

Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions.

Manpages are updated accordingly.

Tested by: Diego Sardina <siarodx at gmail dot com>


# 179670 09-Jun-2008 kib

Provide the mutual exclusion between the nfs export list modifications
and nfs requests processing. Lockmgr lock provides the shared locking for
nfs requests, while exclusive mode is used for modifications. The writer
starvation is handled by lockmgr too.

Reported by: kris, pho, many
Based on the submission by: mohan
Tested by: pho
MFC after: 2 weeks


# 178585 26-Apr-2008 pjd

Implement 'show mount' command in DDB. Without argument, it prints short
info about all currently mounted file systems. When an address is given
as an argument, prints detailed info about the given mount point.

MFC after: 2 weeks


# 176708 01-Mar-2008 attilio

- Handle buffer lock waiters count directly in the buffer cache instead
than rely on the lockmgr support [1]:
* bump the waiters only if the interlock is held
* let brelvp() return the waiters count
* rely on brelvp() instead than BUF_LOCKWAITERS() in order to check
for the waiters number
- Remove a namespace pollution introduced recently with lockmgr.h
including lock.h by including lock.h directly in the consumers and
making it mandatory for using lockmgr.
- Modify flags accepted by lockinit():
* introduce LK_NOPROFILE which disables lock profiling for the
specified lockmgr
* introduce LK_QUIET which disables ktr tracing for the specified
lockmgr [2]
* disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it
can only be used on a per-instance basis
- Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer
used

This patch breaks KPI so __FreBSD_version will be bumped and manpages
updated by further commits. Additively, 'struct buf' changes results in
a disturbed ABI also.

[2] Really, currently there is no ktr tracing in the lockmgr, but it
will be added soon.

[1] Submitted by: kib
Tested by: pho, Andrea Barberio <insomniac at slackware dot it>


# 172151 12-Sep-2007 kib

When restoring the mount after umount failed, the MNTK_UNMOUNT flag
prevents insmntque() from placing reallocated syncer vnode on mount
list, that causes panic in vfs_allocate_syncvnode().

Introduce MNTK_NOINSMNTQ flag, that marks the period when instmntque is
not allowed to success, instead of MNTK_UNMOUNT. The MNTK_NOINSMNTQ is
set and cleared simultaneously with MNTK_UNMOUNT, except on umount error
path, where it is cleaned just before the syncer vnode is going to be
allocated.

Reported by: Peter Jeremy <peterjeremy optushome com au>
Suggested by: tegge
Approved by: re (rwatson)


# 172003 28-Aug-2007 jhb

Rework the routines to convert a 5.x+ statfs structure (with fixed-size
64-bit counters) to a 4.x statfs structure (with long-sized counters).
- For block counters, we scale up the block size sufficiently large so
that the resulting block counts fit into a the long-sized (long for the
ABI, so 32-bit in freebsd32) counters. In 4.x the NFS client's statfs
VOP did this already. This can lie about the block size to 4.x binaries,
but it presents a more accurate picture of the ratios of free and
available space.
- For non-block counters, fix the freebsd32 stats converter to cap the
values at INT32_MAX rather than losing the upper 32-bits to match the
behavior of the 4.x statfs conversion routine in vfs_syscalls.c

Approved by: re (kensmith)


# 168954 22-Apr-2007 rwatson

In the MAC Framework implementation, file systems have two per-mountpoint
labels: the mount label (label of the mountpoint) and the fs label (label
of the file system). In practice, policies appear to only ever use one,
and the distinction is not helpful.

Combine mnt_mntlabel and mnt_fslabel into a single mnt_label, and
eliminate extra machinery required to maintain the additional label.
Update policies to reflect removal of extra entry points and label.

Obtained from: TrustedBSD Project
Sponsored by: SPARTA, Inc.


# 168823 17-Apr-2007 pjd

Export vfs_mount_alloc() as it is used in ZFS.


# 168396 05-Apr-2007 pjd

Add security.jail.mount_allowed sysctl, which allows to mount and
unmount jail-friendly file systems from within a jail.
Precisely it grants PRIV_VFS_MOUNT, PRIV_VFS_UNMOUNT and
PRIV_VFS_MOUNT_NONUSER privileges for a jailed super-user.
It is turned off by default.

A jail-friendly file system is a file system which driver registers
itself with VFCF_JAIL flag via VFS_SET(9) API.
The lsvfs(1) command can be used to see which file systems are
jail-friendly ones.

There currently no jail-friendly file systems, ZFS will be the first one.
In the future we may consider marking file systems like nullfs as
jail-friendly.

Reviewed by: rwatson


# 168216 01-Apr-2007 pjd

More style nits.


# 168212 01-Apr-2007 pjd

Style nit.


# 168185 31-Mar-2007 pjd

Make vfs_mount_destroy() and vfs_freeopts() non-static, I'd like to use them.


# 168020 29-Mar-2007 kib

Extend rev. 1.210 to avoid dereference NULL mp in VFS_NEEDSGIANT and
VFS_ASSERT_GIANT. Stop using reserved namespace.

Reported and tested by: kris
Reviewed and enhanced by: tegge
MFC after: 1 week


# 166795 16-Feb-2007 pjd

Remove VFS_VPTOFH entirely. API is already broken and it is good time to
do it.

Suggested by: rwatson


# 166774 15-Feb-2007 pjd

Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.
This way we may support multiple structures in v_data vnode field within
one file system without using black magic.

Vnode-to-file-handle should be VOP in the first place, but was made VFS
operation to keep interface as compatible as possible with SUN's VFS.
BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.

VFS_VPTOFH() was left for API backward compatibility, but is marked for
removal before 8.0-RELEASE.

Approved by: mckusick
Discussed with: many (on IRC)
Tested with: ufs, msdosfs, cd9660, nullfs and zfs


# 165288 16-Dec-2006 rodrigc

Add a function vfs_deleteopt() which searches through the vfsoptlist
linked list of mount options by name, and deletes the option if it finds it.


# 163839 31-Oct-2006 pjd

Add MNT_GJOURNAL flag which indicates, that file system has gjournal
support enabled.
Add mnt_gjprovider field which keeps gjournal provider's name on which
file system is placed on. This allows to not place file system on gjournal
directly and allows gjournal class to pair gjournal provider with file
system.

Sponsored by: home.pl


# 162983 03-Oct-2006 kib

Fix the remaining race in the revs. 1.232, 1,233 that could occur during
unmount when mp structure is reused while waiting for coveredvp lock.
Introduce struct mount generation count, increment it on each reuse and
compare the generations before and after obtaining the coveredvp lock.

Reviewed by: tegge, pjd
Approved by: pjd (mentor)
MFC after: 2 weeks


# 162650 26-Sep-2006 tegge

Increase mnt_noasync once in softdep_mount() to disallow async io,
closing a window where a file system using softupdates could be async
for a short while if both MNT_UPDATE and MNT_ASYNC were passed as flags
to nmount(). Add MNTK_SOFTDEP flag to ensure that softdep_mount()
doesn't increase mnt_noasync multiple times.


# 162649 26-Sep-2006 tegge

Add mnt_noasync counter to better handle interleaved calls to nmount(),
sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag
which is set only when MNT_ASYNC is set and mnt_noasync is zero, and
check that flag instead of MNT_ASYNC before initiating async io.


# 162647 26-Sep-2006 tegge

Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().


# 162288 13-Sep-2006 mohans

Fixes up the handling of shared vnode lock lookups in the NFS client,
adds a FS type specific flag indicating that the FS supports shared
vnode lock lookups, adds some logic in vfs_lookup.c to test this flag
and set lock flags appropriately.

- amd on 6.x is a non-starter (without this change). Using amd under
heavy load results in a deadlock (with cascading vnode locks all the
way to the root) very quickly.
- This change should also fix the more general problem of cascading
vnode deadlocks when an NFS server goes down.

Ideally, we wouldn't need these changes, as enabling shared vnode lock
lookups globally would work. Unfortunately, UFS, for example isn't
ready for shared vnode lock lookups, crashing pretty quickly.

This change is the result of discussions with Stephan Uphoff (ups@).

Reviewed by: ups@


# 158320 05-May-2006 tegge

Avoid dereferencing NULL pointer.


# 157321 31-Mar-2006 jeff

- Define mnt_startzero and mnt_endzero as a range that excludes mnt_mtx
and mnt_lock so that the mountpoint can be explicitly zeroed on
creation.

Discussed with: tegge
Tested by: kris
Sponsored by: Isilon Systems, Inc.


# 156560 10-Mar-2006 tegge

Block secondary writes while expunging active unlinked files.

Fix detection of active unlinked files by checking VI_OWEINACT and
VI_DOINGINACT in addition to v_usecount.

Defer inactive handling for unlinked files if the file system is mostly
suspended (secondary writes being blocked).

Perform deferred inactive handling after the file system is resumed.


# 156451 08-Mar-2006 tegge

Use vn_start_secondary_write() and vn_finished_secondary_write() as a
replacement for vn_write_suspend_wait() to better account for secondary write
processing.

Close race where secondary writes could be started after ffs_sync() returned
but before the file system was marked as suspended.

Detect if secondary writes or softdep processing occurred during vnode sync
loop in ffs_sync() and retry the loop if needed.


# 156203 02-Mar-2006 jeff

- Move softdep from using a global worklist to per-mount worklists. This
has many positive effects including improved smp locking, reducing
interdependencies between mounts that can lead to deadlocks, etc.
- Add the softdep worklist and various counters to the ufsmnt structure.
- Add a mount pointer to the workitem and remove mount pointers from the
various structures derived from the workitem as they are now redundant.
- Remove the poor-man's semaphore protecting softdep_process_worklist and
softdep_flushworklist. Several threads may now process the list
simultaneously.
- Add softdep_waitidle() to block the thread until all pending
dependencies being operated on by other threads have been flushed.
- Use softdep_waitidle() in unmount and snapshots to block either
operation until the fs is stable.
- Remove softdep worklist processing from the syncer and move it into the
softdep_flush() thread. This thread processes all softdep mounts
once each second and when it is called via the new softdep_speedup()
when there is a resource shortage. This removes the softdep hook
from the kernel and various hacks in header files to support it.

Reviewed by/Discussed with: tegge, truckman, mckusick
Tested by: kris


# 155386 06-Feb-2006 jeff

- Add a ref count to the mount structure. Sleep for up to 3 seconds in
vfs_mount_destroy waiting for this ref to hit 0. We don't print an
error if we are rebooting as the root mount always retains some refernces
by init proc.
- Acquire a mnt ref for every vnode allocated to a mount point. Drop this
ref only once vdestroy() has been called and the mount has been freed.
- No longer NULL the v_mount pointer in delmntque() so that we may release
the ref after vgone() has been called. This allows us to guarantee
that the mount point structure will be valid until the last vnode has
lost its last ref.
- Fix a few places that rely on checking v_mount to detect recycling.

Sponsored by: Isilon Systems, Inc.
MFC After: 1 week


# 154152 09-Jan-2006 tegge

Add marker vnodes to ensure that all vnodes associated with the mount point are
iterated over when using MNT_VNODE_FOREACH.

Reviewed by: truckman


# 153523 19-Dec-2005 pjd

- Document another spare flag (0x00000010).
- Add a 'XXX' comment about MNT_ACLS and MNT_BYFSID flags collision and
explain why it is harmless.
- Add a colon after 'XXX' for consistency.


# 153400 13-Dec-2005 des

Eradicate caddr_t from the VFS API.


# 152912 28-Nov-2005 rodrigc

Remove MNT_NODEV mount option. In RELENG_6, MNT_NODEV was a no-op.
The presence of MNT_NODEV was confusing the am-utils autoconf scripts.

PR: conf/79715


# 152176 08-Nov-2005 rodrigc

Add utility function to propagate mount errors as text string messages.

Discussed with: phk


# 151588 23-Oct-2005 pjd

MNT_JAILDEVFS is not used anymore. Mark it as spare.

OK'ed by: phk


# 148768 05-Aug-2005 ssouhlal

Holding a vnode doesn't prevent v_mount from disappearing (when the
vnode is inactivated), possibly leading to a NULL dereference when
checking if the mount wants knotes to be activated in the VOP hooks.
So, we add a new vnode flag VV_NOKNOTE that is only set in getnewvnode(),
if necessary, and check it when activating knotes.
Since the flags are not erased when a vnode is being held, we can safely
read them.

Reviewed by: kris@
MFC after: 3 days


# 147730 01-Jul-2005 ssouhlal

Fix the recent panics/LORs/hangs created by my kqueue commit by:

- Introducing the possibility of using locks different than mutexes
for the knlist locking. In order to do this, we add three arguments to
knlist_init() to specify the functions to use to lock, unlock and
check if the lock is owned. If these arguments are NULL, we assume
mtx_lock, mtx_unlock and mtx_owned, respectively.

- Using the vnode lock for the knlist locking, when doing kqueue operations
on a vnode. This way, we don't have to lock the vnode while holding a
mutex, in filt_vfsread.

Reviewed by: jmg
Approved by: re (scottl), scottl (mentor override)
Pointyhat to: ssouhlal
Will be happy: everyone


# 147198 09-Jun-2005 ssouhlal

Allow EVFILT_VNODE events to work on every filesystem type, not just
UFS by:
- Making the pre and post hooks for the VOP functions work even when
DEBUG_VFS_LOCKS is not defined.
- Moving the KNOTE activations into the corresponding VOP hooks.
- Creating a MNTK_NOKNOTE flag for the mnt_kern_flag field of struct
mount that permits filesystems to disable the new behavior.
- Creating a default VOP_KQFILTER function: vfs_kqfilter()

My benchmarks have not revealed any performance degradation.

Reviewed by: jeff, bde
Approved by: rwatson, jmg (kqueue changes), grehan (mentor)


# 144053 24-Mar-2005 jeff

- Add a 'flags' parameter to VFS_ROOT(). This is intended to allow
lookup to do shared locks on the root. Filesystems are free to ignore
flags and instead acquire an exclusive lock if they do not support
shared locks.

Sponsored by: Isilon Systems, Inc.


# 143680 16-Mar-2005 phk

Add mnt_hashseed to struct mount and initialize it witn PRNG bits, use
it to get better hashing in vfs_hash.

In case of an insert collision in vfs_hash_insert(), put the loosing vnode
on a special list so that vfs_hash_remove() can just assume that it is on
a list.

Drop the VI_HASHED flag.


# 142153 20-Feb-2005 das

Remove VFS_START(). Its original purpose involved the mfs filesystem,
which is long gone.

Discussed with: mckusick
Reviewed by: phk


# 141634 10-Feb-2005 phk

Make various mountpoint related functions static.


# 140699 24-Jan-2005 jeff

Force commit to note the sponsor of the VFS smp work:

Sponsored By: Isilon Systems, Inc.


# 140697 24-Jan-2005 jeff

- Add the mount flag MNTK_MPSAFE which indicates whether or not Giant
must be held when any vnode owned by the filesystem is manipulated.
- Add VFS_LOCK_GIANT and VFS_UNLOCK_GIANT macros which are used to
conditionally lock and unlock Giant based on a particular mountpoint.


# 140048 11-Jan-2005 phk

Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().

I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:

The credentials for syncing a file (ability to write to the
file) should be checked at the system call level.

Credentials for syncing one or more filesystems ("none")
should be checked at the system call level as well.

If the filesystem implementation needs a particular credential
to carry out the syncing it would logically have to the
cached mount credential, or a credential cached along with
any delayed write data.

Discussed with: rwatson


# 139825 07-Jan-2005 imp

/* -> /*- for license, minor formatting changes


# 138509 07-Dec-2004 phk

The remaining part of nmount/omount/rootfs mount changes. I cannot sensibly
split the conversion of the remaining three filesystems out from the root
mounting changes, so in one go:

cd9660:
Convert to nmount.
Add omount compat shims.
Remove dedicated rootfs mounting code.
Use vfs_mountedfrom()
Rely on vfs_mount.c calling VFS_STATFS()

nfs(client):
Convert to nmount (the simple way, mount_nfs(8) is still necessary).
Add omount compat shims.
Drop COMPAT_PRELITE2 mount arg compatibility.

ffs:
Convert to nmount.
Add omount compat shims.
Remove dedicated rootfs mounting code.
Use vfs_mountedfrom()
Rely on vfs_mount.c calling VFS_STATFS()

Remove vfs_omount() method, all filesystems are now converted.

Remove MNTK_WANTRDWR, handling RO/RW conversions is a filesystem
task, and they all do it now.

Change rootmounting to use DEVFS trampoline:

vfs_mount.c:
Mount devfs on /. Devfs needs no 'from' so this is clean.
symlink /dev to /. This makes it possible to lookup /dev/foo.
Mount "real" root filesystem on /.
Surgically move the devfs mountpoint from under the real root
filesystem onto /dev in the real root filesystem.

Remove now unnecessary getdiskbyname().

kern_init.c:
Don't do devfs mounting and rootvnode assignment here, it was
already handled by vfs_mount.c.

Remove now unused bdevvp(), addaliasu() and addalias(). Put the
few necessary lines in devfs where they belong. This eliminates the
second-last source of bogo vnodes, leaving only the lemming-syncer.

Remove rootdev variable, it doesn't give meaning in a global context and
was not trustworth anyway. Correct information is provided by
statfs(/).


# 138467 06-Dec-2004 phk

Add more functions for handling mount arguments in VFS_MOUNT():

vfs_flagopt() for binary/boolean options.
vfs_getopts() for string options
vfs_filteropt() to check for unknown options.
vfs_scanopt() for scanf() like processing of options.

Also add function for setting the stat.f_mntfromname field.


# 138461 06-Dec-2004 phk

Change the first argument of vfs_cmount() to a handy struct mntarg* and
call it accordingly.

(No filesystems implement vfs_cmount() yet, so this is a no-op commit)


# 138448 06-Dec-2004 phk

Add a few convenient functions in the mount_arg() family and collect the
entire family at the end of the source file.


# 138444 06-Dec-2004 phk

Make struct vfsopt{list} private to vfs_mount.c


# 138412 05-Dec-2004 phk

VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases
doesn't. Most of the implementations have grown weeds for this so they
copy some fields from mnt_stat if the passed argument isn't that.

Fix this the cleaner way: Always call the implementation on mnt_stat
and copy that in toto to the VFS_STATFS argument if different.


# 138363 03-Dec-2004 phk

Implement a function, mount_arg() for accumulating a list of mount parameters
to nmount.

Make kernel_mount() accept the output from mount_arg() and know how to
free the malloc'ed space.

Make kernel_vmount() use the new function.


# 138358 03-Dec-2004 phk

Add vfs_cmount() method to vfs_ops, this is to convert old-style mount
args to nmount request.


# 138356 03-Dec-2004 phk

Retire unused vfs_mount() function in the name of nmount migration.


# 138350 03-Dec-2004 phk

Introduce vfs_byname_kld() which will try to load the filesystem
as a module if possible.

Use it so we don't have linker magic in the middle of the already
complex mount code.


# 138121 26-Nov-2004 phk

Eliminate MNT_NODEV usage, it doesn't have any meaning any more.

Keep a #define MNT_NODEV 0 around to avoid dealing with contrib
userland like mount_smbfs.


# 138077 25-Nov-2004 phk

Integrate the relevant bits of vfs_rootmountalloc() where it matters.


# 137050 29-Oct-2004 phk

Loose vfs_mountedon()


# 133132 04-Aug-2004 maxim

o Fix a typo in the comment.


# 132902 30-Jul-2004 phk

Put a version element in the VFS filesystem configuration structure
and refuse initializing filesystems with a wrong version. This will
aid maintenance activites on the 5-stable branch.

s/vfs_mount/vfs_omount/

s/vfs_nmount/vfs_mount/

Name our filesystems mount function consistently.

Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space. A few places abused
it to get hold of some credentials to pass around. Effectively
it is unused.

Reorganize the root filesystem selection code.


# 132710 27-Jul-2004 phk

Convert the vfsconf list to a TAILQ.

Introduce vfs_byname() function to find things on it.

Staticize vfs_nmount() function under the name vfs_donmount().

Various cleanups.


# 132320 17-Jul-2004 alfred

Fix macro so that we don't get missing initializer warnings.


# 132023 12-Jul-2004 alfred

Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.


# 131780 08-Jul-2004 alfred

struct mount->mnt_data has been a qaddr_t since '94 (rev 1.1),
It should be a void *, fix it.


# 131733 07-Jul-2004 alfred

do the vfsstd thing instead of messing up our VFS_SYSCTL macro.


# 131695 06-Jul-2004 alfred

Introduce vfs_suser(), used to test if a user should have special privs
for a mount.


# 131691 06-Jul-2004 alfred

NFS mobility PHASE I, II & III (phase VI, and V pending):

Rebind the client socket when we experience a timeout. This fixes
the case where our IP changes for some reason.

Signal a VFS event when NFS transitions from up to down and vice
versa.

Add a placeholder vfs_sysctl where we will put status reporting
shortly.

Also:
Make down NFS mounts return EIO instead of EINTR when there is a
soft timeout or force unmount in progress.


# 131593 04-Jul-2004 alfred

Pass the operation in with the fsidctl.
Remove some fsidctls that we will not be using.
Correct prototypes for fs sysctls.


# 131562 04-Jul-2004 alfred

Introduce a new kevent filter. EVFILT_FS that will be used to signal
generic filesystem events to userspace. Currently only mount and unmount
of filesystems are signalled. Soon to be added, up/down status of NFS.

Introduce a sysctl node used to route requests to/from filesystems
based on filesystem ids.

Introduce a new vfsop, vfs_sysctl(mp, req) that is used as the callback/
entrypoint by the sysctl code to change individual filesystems.


# 131551 04-Jul-2004 phk

When we traverse the vnodes on a mountpoint we need to look out for
our cached 'next vnode' being removed from this mountpoint. If we
find that it was recycled, we restart our traversal from the start
of the list.

Code to do that is in all local disk filesystems (and a few other
places) and looks roughly like this:

MNT_ILOCK(mp);
loop:
for (vp = TAILQ_FIRST(&mp...);
(vp = nvp) != NULL;
nvp = TAILQ_NEXT(vp,...)) {
if (vp->v_mount != mp)
goto loop;
MNT_IUNLOCK(mp);
...
MNT_ILOCK(mp);
}
MNT_IUNLOCK(mp);

The code which takes vnodes off a mountpoint looks like this:

MNT_ILOCK(vp->v_mount);
...
TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
...
MNT_IUNLOCK(vp->v_mount);
...
vp->v_mount = something;

(Take a moment and try to spot the locking error before you read on.)

On a SMP system, one CPU could have removed nvp from our mountlist
but not yet gotten to assign a new value to vp->v_mount while another
CPU simultaneously get to the top of the traversal loop where it
finds that (vp->v_mount != mp) is not true despite the fact that
the vnode has indeed been removed from our mountpoint.

Fix:

Introduce the macro MNT_VNODE_FOREACH() to traverse the list of
vnodes on a mountpoint while taking into account that vnodes may
be removed from the list as we go. This saves approx 65 lines of
duplicated code.

Split the insmntque() which potentially moves a vnode from one mount
point to another into delmntque() and insmntque() which does just
what the names say.

Fix delmntque() to set vp->v_mount to NULL while holding the
mountpoint lock.


# 130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# 128143 11-Apr-2004 mux

Belatedly remove the getvfsent(3) API. All the consumers have been
updated to use getvfsbyname(3) or the vfs.conflist sysctl since a
long time, except mount_smbfs(8) which has just been fixed.


# 128142 11-Apr-2004 mux

Put struct ovfsconf inside BURN_BRIDGES as well.


# 127976 07-Apr-2004 imp

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# 127932 06-Apr-2004 bde

Oops, fixed insertion sort error in the fix for an insertion sort error.

While here, begin fixing dependencies of <sys/mount.h> on normal namespace
pollution (__BSD_VISIBLE) by not using u_int in the prototype for nmount(2),
although it is used in the man page.

While there, begin cleaning up another set of prototypes:
- use u_int in the prototype for the kernel part of nmount().
- consistently don't use parameter names in prototypes in the
"exported vnode operations" set of prototypes, although style(9) says to
use names in the kernel.


# 127928 06-Apr-2004 bde

Fixed unsorting of prototypes in previous commit and 1.134.


# 127891 05-Apr-2004 dfr

Regen.


# 127058 16-Mar-2004 tjr

Make vfs_nmount() public. The Linux emulator needs this in order to mount
linprocfs filesystems.


# 126852 11-Mar-2004 phk

Remove unused mnt_reservedvnlist field.


# 125338 02-Feb-2004 pjd

Added flag MNT_USER to MNT_UPDATEMASK, it will be used for detecting
file systems mounted by unprivileged users.

Reviewed by: rwatson
Approved by: scottl (mentor)
MFC after: 3 days


# 122537 12-Nov-2003 mckusick

Update the statfs structure with 64-bit fields to allow
accurate reporting of multi-terabyte filesystem sizes.

You should build and boot a new kernel BEFORE doing a `make world'
as the new kernel will know about binaries using the old statfs
structure, but an old kernel will not know about the new system
calls that support the new statfs structure. Running an old kernel
after a `make world' will cause programs such as `df' that do a
statfs system call to fail with a bad system call.

Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Tim Robbins <tjr@freebsd.org>
Reviewed by: Julian Elischer <julian@elischer.org>
Reviewed by: the hoards of <arch@freebsd.org>
Sponsored by: DARPA & NAI Labs.


# 122524 12-Nov-2003 rwatson

Modify the MAC Framework so that instead of embedding a (struct label)
in various kernel objects to represent security data, we embed a
(struct label *) pointer, which now references labels allocated using
a UMA zone (mac_label.c). This allows the size and shape of struct
label to be varied without changing the size and shape of these kernel
objects, which become part of the frozen ABI with 5-STABLE. This opens
the door for boot-time selection of the number of label slots, and hence
changes to the bound on the number of simultaneous labeled policies
at boot-time instead of compile-time. This also makes it easier to
embed label references in new objects as required for locking/caching
with fine-grained network stack locking, such as inpcb structures.

This change also moves us further in the direction of hiding the
structure of kernel objects from MAC policy modules, not to mention
dramatically reducing the number of '&' symbols appearing in both the
MAC Framework and MAC policy modules, and improving readability.

While this results in minimal performance change with MAC enabled, it
will observably shrink the size of a number of critical kernel data
structures for the !MAC case, and should have a small (but measurable)
performance benefit (i.e., struct vnode, struct socket) do to memory
conservation and reduced cost of zeroing memory.

NOTE: Users of MAC must recompile their kernel and all MAC modules as a
result of this change. Because this is an API change, third party
MAC modules will also need to be updated to make less use of the '&'
symbol.

Suggestions from: bmilekic
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# 122523 12-Nov-2003 kan

1. Consolidate mount struct allocation/destruction into a common code in
vfs_mount_alloc/vfs_mount_destroy functions and take care to completely
destroy the mount point along with its locks. Mount struct has grown in
coplexity recently and depending on each failure path to destroy it
completely isn't working anymore.

2. Eliminate largely identical vfs_mount and vfs_unmount question by
moving the code to handle both cases into a newly introduced vfs_domount
function.

3. Simplify nfs_mount_diskless to always expect an allocated mount
struct and never attempt an allocation/destruction itself. The
vfs_allocroot allocation was there to support 'magic' swap space
configuration for diskless clients that was already removed by PHK some
time ago.

4. Include a vfs_buildopts cleanups by Peter Edwards to validate the
sanity of nmount parameters passed from userland.

Submitted by: (4) Peter Edwards <peter.edwards@openet-telecom.com>
Reviewed by: rwatson


# 122091 05-Nov-2003 kan

Remove mntvnode_mtx and replace it with per-mountpoint mutex.
Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to
operate on this mutex transparently.

Eventually new mutex will be protecting more fields in
struct mount, not only vnode list.

Discussed with: jeff


# 117132 01-Jul-2003 iedowse

Add a new mount flag MNT_BYFSID that can be used to unmount a file
system by specifying the file system ID instead of a path. Use this
by default in umount(8). This avoids the need to perform any vnode
operations to look up the mount point, so it makes it possible to
unmount a file system whose root vnode cannot be looked up (e.g.
due to a dead NFS server, or a file system that has become detached
from the hierarchy because an underlying file system was unmounted).
It also provides an unambiguous way to specify which file system is
to be unmunted.

Since the ability to unmount using a path name is retained only for
compatibility, that case now just uses a simple string comparison
of the supplied path against f_mntonname of each mounted file system.

Discussed on: freebsd-arch
mdoc help from: ru


# 112693 26-Mar-2003 tegge

Adjust the number of vnodes scanned by vlrureclaim() according to the
size of the vnode list.


# 112119 11-Mar-2003 kan

Rename vfs_stdsync function to vfs_stdnosync which matches more
closely what function is really doing. Update all existing consumers
to use the new name.

Introduce a new vfs_stdsync function, which iterates over mount
point's vnodes and call FSYNC on each one of them in turn.

Make nwfs and smbfs use this new function instead of rolling their
own identical sync implementations.

Reviewed by: jeff


# 112067 10-Mar-2003 kan

Remove trainling whitespace.


# 108330 27-Dec-2002 rwatson

Re-add MNT_ACLS to the list of "updateable" mount flags, per our
documentation. Generally, you really shouldn't twiddle the flag,
but there are sensible scenarios where one might.

Obtained from: TrustedBSD Project


# 106582 07-Nov-2002 mux

A bunch of style(9) fixes.

Obtained from: bde


# 106576 07-Nov-2002 mux

- Use a better definition for MNAMELEN which doesn't require
to have one #ifdef per architecture.
- Change a space to a tab after a nearby #define.

Obtained from: bde


# 105113 14-Oct-2002 rwatson

Define MNT_ACLS, which can report on the status of the FS_ACLS flag
used by UFS to administratively enable support for extended ACLs.

While I'm here, remove MNT_MULTILABEL from the list of file system
flags we permit to be updated after the initial mount.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# 102088 19-Aug-2002 phk

Keep a copy of the credential used to mount filesystems around so
we can check and use it later on.

Change the pieces of code which relied on mount->mnt_stat.f_owner
to check which user mounted the filesystem.

This became needed as the EA code needs to be able to allocate
blocks for "system" EA users like ACLs.

There seems to be some half-baked (probably only quarter- actually)
notion that the superuser for a given filesystem is the user who
mounted it, but this has far from been carried through. It is
unclear if it should be.

Sponsored by: DARPA & NAI Labs.


# 101848 13-Aug-2002 rwatson

Move to a nested include of _label.h instead of mac.h in sys/sys/*.h
(Most of the places where mac.h was recursively included from another
kernel header file. net/netinet to follow.)

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
Suggested by: bde


# 101832 13-Aug-2002 mux

Forward define struct iovec instead of including
sys/uio.h and polluting the namespace even more.


# 101777 13-Aug-2002 phk

Introduce typedefs for the member functions of struct vfsops and employ
these in the main filesystems. This does not change the resulting code
but makes the source a little bit more grepable.

Sponsored by: DARPA and NAI Labs.


# 101660 11-Aug-2002 mux

Don't #ifdef _KERNEL struct vfsconf, mount_smbfs(8)
still uses it.

Submitted by: jake


# 101659 10-Aug-2002 mux

One declaration for struct xvfsconf is enough. I have
no idea how this happened. :-)

Reported by: Norman C. Rice <nrice@emu.sourcee.com>


# 101651 10-Aug-2002 mux

- Introduce a new struct xvfsconf, the userland version of struct vfsconf.
- Make getvfsbyname() take a struct xvfsconf *.
- Convert several consumers of getvfsbyname() to use struct xvfsconf.
- Correct the getvfsbyname.3 manpage.
- Create a new vfs.conflist sysctl to dump all the struct xvfsconf in the
kernel, and rewrite getvfsbyname() to use this instead of the weird
existing API.
- Convert some {set,get,end}vfsent() consumers to use the new vfs.conflist
sysctl.
- Convert a vfsload() call in nfsiod.c to kldload() and remove the useless
vfsisloadable() and endvfsent() calls.
- Add a warning printf() in vfs_sysctl() to tell people they are using
an old userland.

After these changes, it's possible to modify struct vfsconf without
breaking the binary compatibility. Please note that these changes don't
break this compatibility either.

When bp will have updated mount_smbfs(8) with the patch I sent him, there
will be no more consumers of the {set,get,end}vfsent(), vfsisloadable()
and vfsload() API, and I will promptly delete it.


# 100985 30-Jul-2002 rwatson

Begin committing support for Mandatory Access Control and extensible
kernel access control. The MAC framework permits loadable kernel
modules to link to the kernel at compile-time, boot-time, or run-time,
and augment the system security policy. This commit includes the
initial kernel implementation, although the interface with the userland
components of the oeprating system is still under work, and not all
kernel subsystems are supported. Later in this commit sequence,
documentation of which kernel subsystems will not work correctly with
a kernel compiled with MAC support will be added.

Label file system mount points, permitting security information to be
maintained at the granularity of the file system. Two labels are
currently maintained: a security label for the mount itself, and
a default label for objects in the file system (in particular, for
file systems not supporting per-vnode labeling directly).

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 99696 09-Jul-2002 mux

Remove vfs_stdmount() and vfs_stdunmount(). They are not
really useful and are incompatible with nmount.


# 99336 03-Jul-2002 mux

Remove an unused argument in vfs_mountroot().


# 99264 02-Jul-2002 mux

Move every code related to mount(2) in a new file, vfs_mount.c.
The file vfs_conf.c which was dealing with root mounting has
been repo-copied into vfs_mount.c to preserve history.
This makes nmount related development easier, and help reducing
the size of vfs_syscalls.c, which is still an enormous file.

Reviewed by: rwatson
Repo-copy by: peter


# 98625 22-Jun-2002 mux

o Remove the initialization of unused fields in the struct
uio now that we don't use uiomove() anymore.
o Enforce stricter checks on the length of the iov's in
nmount(2) since we now malloc() them individually and
corrupted iov's could make the kernel crash in malloc()
with "kmem_map too small".

Reviewed by: phk


# 98510 20-Jun-2002 mux

Change the way we internally store the mount options to
a linked list. This is to allow the merging of the mount
options in the MNT_UPDATE case, as the current data structure
is unsuitable for this.

There are no functional differences in this commit.

Reviewed by: phk


# 98233 14-Jun-2002 mux

Change vfs_copyopt() so that the length argument passed to it
must be the exact same size as the mount option. This makes
vfs_copyopt() much more useful.


# 97188 23-May-2002 mux

Update comments to better match reality.


# 96755 16-May-2002 trhodes

More s/file system/filesystem/g


# 94903 16-Apr-2002 iedowse

The recent NFS forced unmount improvements introduced a side-effect
where some client operations might be unexpectedly cancelled during
an unsuccessful non-forced unmount attempt. This causes problems
for amd(8), because it periodically attempts a non-forced unmount
to check if the filesystem is still in use.

Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set
only during the operation of a forced unmount. Use this instead of
MNTK_UNMOUNT to trigger the cancellation of hung NFS operations.

Also correct a problem where dounmount() might inadvertently clear
the MNTK_UNMOUNT flag.

Reported by: simokawa
MFC after: 1 week


# 93230 26-Mar-2002 mux

Commit the good prototype for nmount(2).

Reviewed by: phk


# 93228 26-Mar-2002 mux

As discussed in -arch, add the new nmount(2) system call and the
new vfs_getopt()/vfs_copyopt() API. This is intended to be used
later, when there will be filesystems implementing the VFS_NMOUNT
operation. The mount(2) system call will disappear when all
filesystems will be converted to the new API. Documentation will
be committed in a while.

Reviewed by: phk


# 93008 23-Mar-2002 bde

Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses. Switch to KNF
formatting and/or rewrap the whole prototype in some cases.


# 92719 19-Mar-2002 alfred

Remove __P


# 92462 16-Mar-2002 mckusick

Add a flags parameter to VFS_VGET to pass through the desired
locking flags when acquiring a vnode. The immediate purpose is
to allow polling lock requests (LK_NOWAIT) needed by soft updates
to avoid deadlock when enlisting other processes to help with
the background cleanup. For the future it will allow the use of
shared locks for read access to vnodes. This change touches a
lot of files as it affects most filesystems within the system.
It has been well tested on FFS, loopback, and CD-ROM filesystems.
only lightly on the others, so if you find a problem there, please
let me (mckusick@mckusick.com) know.


# 91859 08-Mar-2002 phk

Move the mount of the root filesystem to happen in the init process before
the exec if /sbin/init.

This allows the scheduler to get started and kthreads a chance to run
before we start filesystem operations.


# 91702 05-Mar-2002 rwatson

Reserve a mount flag, MNT_MULTILABEL, used by the MAC subsystem and
individual filesystems to determine whether they should operate in
"file system as a single object" mode, or "file system as a set of objects
with individual labels" mode. Note: in the trustedbsd_mac branch,
this is refered to as "MNT_MULTILEVEL", but the two mean the same thing.
MNT_MULTILABEL is more suggestive of a flexible policy system than one
providing purely hierarchal policies. The need for a reserved flag will
go away once nmount() is done.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 86078 05-Nov-2001 phk

Define a new mount flag "MNT_JAILDEVFS"

Collect the magic combination of flags which can be updated into
a macro in sys/mount.h rather than inlining them (twice!) in
vfs_syscalls.c


# 86037 04-Nov-2001 dillon

Add mnt_reservedvnlist so we can MFC to 4.x, in order to make all mount
structure changes now rather then piecemeal later on. mnt_nvnodelist
currently holds all the vnodes under the mount point. This will eventually
be split into a 'dirty' and 'clean' list. This way we only break kld's once
rather then twice. nvnodelist will eventually turn into the dirty list
and should remain compatible with the klds.


# 85339 22-Oct-2001 dillon

Change the vnode list under the mount point from a LIST to a TAILQ
in preparation for an implementation of limiting code for kern.maxvnodes.

MFC after: 3 days


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 80706 31-Jul-2001 jake

Machine dependent ifdefs for sparc64.


# 79482 09-Jul-2001 des

Constify the fstype argument to vfs_mount(). This eliminates at least one
"call discards qualifier" warning (in sys/compat/linux/linux_file.c).


# 77956 10-Jun-2001 benno

Changes to sys/ includes to support PowerPC.

Reviewed by: obrien, dfr


# 76166 01-May-2001 markm

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# 75934 25-Apr-2001 phk

Move the netexport structure from the fs-specific mountstructure
to struct mount.

This makes the "struct netexport *" paramter to the vfs_export
and vfs_checkexport interface unneeded.

Consequently that all non-stacking filesystems can use
vfs_stdcheckexp().

At the same time, make it a pointer to a struct netexport
in struct mount, so that we can remove the bogus AF_MAX
and #include <net/radix.h> from <sys/mount.h>


# 75890 23-Apr-2001 grog

Back out previous commit.

Requested by: bde


# 75855 23-Apr-2001 grog

Remove bogus #include and duplicate definition of AF_MAX. These were
made necessary by breakage in usr.sbin/pstat and usr.bin/fstat, since
fixed.

Suggested by: phk
Unearthed by: John Hood <jhood@sitaranetworks.com>


# 75847 23-Apr-2001 grog

Add address families AF_SLOW and AF_SCLUSTER. These are used by the
Sitara QoSworks box.

Obtained from: Sitara Networks Inc.


# 74437 19-Mar-2001 rwatson

o Rename "namespace" argument to "attrnamespace" as namespace is a C++
reserved word.

Submitted by: jkh
Obtained from: TrustedBSD Project


# 74273 15-Mar-2001 rwatson

o Change the API and ABI of the Extended Attribute kernel interfaces to
introduce a new argument, "namespace", rather than relying on a first-
character namespace indicator. This is in line with more recent
thinking on EA interfaces on various mailing lists, including the
posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces
are defined by default, EXTATTR_NAMESPACE_SYSTEM and
EXTATTR_NAMESPACE_USER, where the primary distinction lies in the
access control model: user EAs are accessible based on the normal
MAC and DAC file/directory protections, and system attributes are
limited to kernel-originated or appropriately privileged userland
requests.

o These API changes occur at several levels: the namespace argument is
introduced in the extattr_{get,set}_file() system call interfaces,
at the vnode operation level in the vop_{get,set}extattr() interfaces,
and in the UFS extended attribute implementation. Changes are also
introduced in the VFS extattrctl() interface (system call, VFS,
and UFS implementation), where the arguments are modified to include
a namespace field, as well as modified to advoid direct access to
userspace variables from below the VFS layer (in the style of recent
changes to mount by adrian@FreeBSD.org). This required some cleanup
and bug fixing regarding VFS locks and the VFS interface, as a vnode
pointer may now be optionally submitted to the VFS_EXTATTRCTL()
call. Updated documentation for the VFS interface will be committed
shortly.

o In the near future, the auto-starting feature will be updated to
search two sub-directories to the ".attribute" directory in appropriate
file systems: "user" and "system" to locate attributes intended for
those namespaces, as the single filename is no longer sufficient
to indicate what namespace the attribute is intended for. Until this
is committed, all attributes auto-started by UFS will be placed in
the EXTATTR_NAMESPACE_SYSTEM namespace.

o The default POSIX.1e attribute names for ACLs and Capabilities have
been updated to no longer include the '$' in their filename. As such,
if you're using these features, you'll need to rename the attribute
backing files to the same names without '$' symbols in front.

o Note that these changes will require changes in userland, which will
be committed shortly. These include modifications to the extended
attribute utilities, as well as to libutil for new namespace
string conversion routines. Once the matching userland changes are
committed, a buildworld is recommended to update all the necessary
include files and verify that the kernel and userland environments
are in sync. Note: If you do not use extended attributes (most people
won't), upgrading is not imperative although since the system call
API has changed, the new userland extended attribute code will no longer
compile with old include files.

o Couple of minor cleanups while I'm there: make more code compilation
conditional on FFS_EXTATTR, which should recover a bit of space on
kernels running without EA's, as well as update copyright dates.

Obtained from: TrustedBSD Project


# 73286 01-Mar-2001 adrian

Reviewed by: jlemon

An initial tidyup of the mount() syscall and VFS mount code.

This code replaces the earlier work done by jlemon in an attempt to
make linux_mount() work.

* the guts of the mount work has been moved into vfs_mount().

* move `type', `path' and `flags' from being userland variables into being
kernel variables in vfs_mount(). `data' remains a pointer into
userspace.

* Attempt to verify the `type' and `path' strings passed to vfs_mount()
aren't too long.

* rework mount() and linux_mount() to take the userland parameters
(besides data, as mentioned) and pass kernel variables to vfs_mount().
(linux_mount() already did this, I've just tidied it up a little more.)

* remove the copyin*() stuff for `path'. `data' still requires copyin*()
since its a pointer into userland.

* set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each
filesystem. This variable is generally initialised with `path', and
each filesystem can override it if they want to.

* NOTE: f_mntonname is intiailised with "/" in the case of a root mount.


# 72650 18-Feb-2001 green

Switch to using a struct xucred instead of a struct xucred when not
actually in the kernel. This structure is a different size than
what is currently in -CURRENT, but should hopefully be the last time
any application breakage is caused there. As soon as any major
inconveniences are removed, the definition of the in-kernel struct
ucred should be conditionalized upon defined(_KERNEL).

This also changes struct export_args to remove dependency on the
constantly-changing struct ucred, as well as limiting the bounds
of the size fields to the correct size. This means: a) mountd and
friends won't break all the time, b) mountd and friends won't crash
the kernel all the time if they don't know what they're doing wrt
actual struct export_args layout.

Reviewed by: bde


# 72537 16-Feb-2001 jlemon

Introduce copyinfrom and copyinstrfrom, which can copy data from either
user or kernel space. This will allow layering of os-compat (e.g.: linux)
system calls. Apply the changes to mount.


# 69564 04-Dec-2000 alfred

remove struct mount from useland visibility


# 67073 13-Oct-2000 bde

Fixed namespace pollution in rev.1.78. Don't export <sys/stat.h> to
userland from here; just forward declare struct stat. fhstat.2
(== fhopen.2 == fhstatfs.2) has always specified including
<sys/stat.h> before using any of the fh functions although this is
only necessary for dereferencing the "struct stat *" arg of fhstat(),
so applications should not notice this change.

Fixed unsorting of user prototypes in rev.1.78.


# 66615 03-Oct-2000 jasone

Convert lockmgr locks from using simple locks to using mutexes.

Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.


# 66457 29-Sep-2000 dfr

Add ia64 support.


# 62968 11-Jul-2000 mckusick

Clean up warning about undeclared function by declaring softdep_fsync
in mount.h instead of ffs_extern.h. The correct solution is to use
an indirect function pointer so that the kernel does not have to be
built with options FFS, but that will be left for another day.


# 62553 04-Jul-2000 mckusick

Get userland visible flags added for snapshots to give a few days
advance preparation for them to get migrated into place so that
subsequent changes in utilities will not fail to compile for lack
of up-to-date header files in /usr/include.


# 61729 16-Jun-2000 phk

ARGH! I have too many source trees :-(

Fix prototype errors in last commit.


# 61724 16-Jun-2000 phk

Virtualizes & untangles the bioops operations vector.

Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@


# 60938 26-May-2000 jake

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 60833 23-May-2000 jake

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 56272 19-Jan-2000 rwatson

Fix bde'isms in acl/extattr syscall interface, renaming syscalls to
prettier (?) names, adding some const's around here, et al.

Reviewed by: bde


# 55205 29-Dec-1999 peter

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.


# 54803 19-Dec-1999 rwatson

Second pass commit to introduce new ACL and Extended Attribute system
calls, vnops, vfsops, both in /kern, and to individual file systems that
require a vfsop_ array entry.

Reviewed by: eivind


# 54051 03-Dec-1999 jkh

Define name length differently for alpha in order to preserve
backwards compatibility.

Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>
Reviewed by: mckusick


# 53975 01-Dec-1999 mckusick

Collect read and write counts for filesystems. This new code
drops the counting in bwrite and puts it all in spec_strategy.
I did some tests and verified that the counts collected for writes
in spec_strategy is identical to the counts that we previously
collected in bwrite. We now also get read counts (async reads
come from requests for read-ahead blocks). Note that you need
to compile a new version of mount to get the read counts printed
out. The old mount binary is completely compatible, the only
reason to install a new mount is to get the read counts printed.

Submitted by: Craig A Soules <soules+@andrew.cmu.edu>
Reviewed by: Kirk McKusick <mckusick@mckusick.com>


# 53452 20-Nov-1999 phk

struct mountlist and struct mount.mnt_list have no business being
a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively.

This removes ugly mp != (void*)&mountlist comparisons.

Requested by: phk
Submitted by: Jake Burkholder jake@checker.org
PR: 14967


# 52735 01-Nov-1999 julian

Most modern OSs have the ability to flag certain mounts as ones to
be ignored by default by the df(1) program. This is used mostly to
avoid stat()-ing entries that do not represent "real" disk mount
points (such as those made by an automounter such as amd.) It is
also useful not to have to stat() these entries because it takes
longer to report them that for other file systems, being that these
mount points are served by a user-level file server and resulting in
several context switches. Worse, if the automounter is down
unexpectedly, a causal df(1) will hang in an interruptible way.

PR: kern/9764
Submitted by: Erez Zadok <ezk@cs.columbia.edu>


# 52419 21-Oct-1999 julian

Whistle's Netgraph link-layer (sometimes more) networking infrastructure.
Been in production for 3 years now. Gives Instant Frame relay to if_sr
and if_ar drivers, and PPPOE support soon. See:
ftp://ftp.whistle.com/pub/archie/netgraph/index.html
for on-line manual pages.

Reviewed by: Doug Rabson (dfr@freebsd.org)
Obtained from: Whistle CVS tree


# 52248 15-Oct-1999 msmith

Implement pseudo_AF_HDRCMPLT, which controls the state of the 'header
completion' flag. If set, the interface output routine will assume that
the packet already has a valid link-level source address. This defaults
to off (the address is overwritten)

PR: kern/10680
Submitted by: "Christopher N . Harrell" <cnh@mindspring.net>
Obtained from: NetBSD


# 51797 29-Sep-1999 phk

Remove v_maxio from struct vnode.

Replace it with mnt_iosize_max in struct mount.

Nits from: bde


# 51388 19-Sep-1999 dillon

Fix BOOTP root FS mounts. Also cleanup vfs_getnewfsid() and collapse
addaliasu() into addalias() (no operational change) and clarify comments
relating to a trick that vclean() uses.

The fix to BOOTP is yet another hack. Actually, rootfsid handling
is already a major hack. The whole thing needs to be cleaned up.

Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>


# 51138 10-Sep-1999 alfred

Seperate the export check in VFS_FHTOVP, exports are now checked via
VFS_CHECKEXP.

Add fh(open|stat|stafs) syscalls to allow userland to query filesystems
based on (network) filehandle.

Obtained from: NetBSD


# 51068 07-Sep-1999 alfred

All unimplemented VFS ops now have entries in kern/vfs_default.c that return
reasonable defaults.

This avoids confusing and ugly casting to eopnotsupp or making dummy functions.
Bogus casting of filesystem sysctls to eopnotsupp() have been removed.

This should make *_vfsops.c more readable and reduce bloat.

Reviewed by: msmith, eivind
Approved by: phk
Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 45739 17-Apr-1999 peter

Well folks, this is it - The second stage of the removal for build support
for LKM's..


# 44078 16-Feb-1999 dfr

* Change sysctl from using linker_set to construct its tree using SLISTs.
This makes it possible to change the sysctl tree at runtime.

* Change KLD to find and register any sysctl nodes contained in the loaded
file and to unregister them when the file is unloaded.

Reviewed by: Archie Cobbs <archie@whistle.com>,
Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)


# 41169 15-Nov-1998 bde

Fixed the type and order of vfs_modevent. This fixes part of a spew of
warnings for the recent change of the type of a module event handler.

Fixed a rotted comment (numeric types of filesystems are not listed here).

Made the function protototype in VFS_SET() more like the corresponding
function definition (don't use extern for prototypes).

Enforce a semicolon after the LKM case of VFS_SET().


# 41056 10-Nov-1998 peter

Make the vnode opv vector construction fully dynamic. Previously we
leaked memory on each unload and were limited to items referenced in
the kernel copy of vnode_if.c. Now a kernel module is free to create
it's own VOP_FOO() routines and the rest of the system will happily
deal with it, including passthrough layers like union/umap/etc.

Have VFS_SET() call a common vfs_modevent() handler rather than
inline duplicating the common code all over the place.

Have VNODEOP_SET() have the vnodeops removed at unload time (assuming a
module) so that the vop_t ** vector is reclaimed.

Slightly adjust the vop_t ** vectors so that calling slot 0 is a panic
rather than a page fault. This could happen if VOP_something() was called
without *any* handlers being present anywhere (including in vfs_default.c).
slot 1 becomes the default vector for the vnodeop table.

TODO: reclaim zones on unload (eg: nfs code)


# 40986 07-Nov-1998 peter

oops! s/vfs_register/vfs_unregister/ in the unload case..

Mentioned by: dfr


# 40966 06-Nov-1998 peter

Remove trailing ';' - use the one supplied by the caller: "VFS_SET(foo);"


# 40435 16-Oct-1998 peter

*gulp*. Jordan specifically OK'ed this..

This is the bulk of the support for doing kld modules. Two linker_sets
were replaced by SYSINIT()'s. VFS's and exec handlers are self registered.
kld is now a superset of lkm. I have converted most of them, they will
follow as a seperate commit as samples.
This all still works as a static a.out kernel using LKM's.


# 39271 15-Sep-1998 phk

(this is an extract from src/share/examples/atm/README)

===================================
HARP | Host ATM Research Platform
===================================

HARP 3

What is this stuff?
-------------------
The Advanced Networking Group (ANG) at the Minnesota Supercomputer Center,
Inc. (MSCI), as part of its work on the MAGIC Gigabit Testbed, developed
the Host ATM Research Platform (HARP) software, which allows IP hosts to
communicate over ATM networks using standard protocols. It is intended to
be a high-quality platform for IP/ATM research.

HARP provides a way for IP hosts to connect to ATM networks. It supports
standard methods of communication using IP over ATM. A host's standard IP
software sends and receives datagrams via a HARP ATM interface. HARP provides
functionality similar to (and typically replaces) vendor-provided ATM device
driver software.

HARP includes full source code, making it possible for researchers to
experiment with different approaches to running IP over ATM. HARP is
self-contained; it requires no other licenses or commercial software packages.

HARP implements support for the IETF Classical IP model for using IP over ATM
networks, including:

o IETF ATMARP address resolution client
o IETF ATMARP address resolution server
o IETF SCSP/ATMARP server
o UNI 3.1 and 3.0 signalling protocols
o Fore Systems's SPANS signalling protocol

What's supported
----------------
The following are supported by HARP 3:

o ATM Host Interfaces
- FORE Systems, Inc. SBA-200 and SBA-200E ATM SBus Adapters
- FORE Systems, Inc. PCA-200E ATM PCI Adapters
- Efficient Networks, Inc. ENI-155p ATM PCI Adapters

o ATM Signalling Protocols
- The ATM Forum UNI 3.1 signalling protocol
- The ATM Forum UNI 3.0 signalling protocol
- The ATM Forum ILMI address registration
- FORE Systems's proprietary SPANS signalling protocol
- Permanent Virtual Channels (PVCs)

o IETF "Classical IP and ARP over ATM" model
- RFC 1483, "Multiprotocol Encapsulation over ATM Adaptation Layer 5"
- RFC 1577, "Classical IP and ARP over ATM"
- RFC 1626, "Default IP MTU for use over ATM AAL5"
- RFC 1755, "ATM Signaling Support for IP over ATM"
- RFC 2225, "Classical IP and ARP over ATM"
- RFC 2334, "Server Cache Synchronization Protocol (SCSP)"
- Internet Draft draft-ietf-ion-scsp-atmarp-00.txt,
"A Distributed ATMARP Service Using SCSP"

o ATM Sockets interface
- The file atm-sockets.txt contains further information

What's not supported
--------------------
The following major features of the above list are not currently supported:

o UNI point-to-multipoint support
o Driver support for Traffic Control/Quality of Service
o SPANS multicast and MPP support
o SPANS signalling using Efficient adapters

This software was developed under the sponsorship of the Defense Advanced
Research Projects Agency (DARPA).

Reviewed (lightly) by: phk
Submitted by: Network Computing Services, Inc.


# 38909 07-Sep-1998 bde

Removed statically configured mount type numbers (MOUNT_*) and all
references to them.

The change a couple of days ago to ignore these numbers in statically
configured vfsconf structs was slightly premature because the cd9660,
cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number
in their vfsconf struct.


# 38866 05-Sep-1998 bde

Instantiate `nfs_mount_type' in a standard file so that it is present
when nfs is an LKM. Declare it in a header file. Don't forget to use
it in non-Lite2 code. Initialize it to -1 instead of to 0, since 0
will soon be the mount type number for the first vfs loaded.

NetBSD uses strcmp() to avoid this ugly global.


# 38757 02-Sep-1998 bde

Added a vfs_oid pointer and a vfs_uninit() function to struct
vfsops. vfs_oid will be used to attach and detach vfs sysctls
dynamically. vfs_uninit() will be used to clean up before
modunloading vfs LKMs. The nfs LKM needs these features most.


# 38756 02-Sep-1998 bde

Backed out previous commit. VFS_LKM_NO_DEFAULT_DISPATCH wasn't used for
long, and the ifdef for it broke the forward declaration for the
dispatch function.


# 37863 25-Jul-1998 alex

Allow VFS LKMs to override the default module dispatch functions if
VFS_LKM_NO_DEFAULT_DISPATCH is defined.


# 35769 06-May-1998 msmith

As described by the submitter:

Reverse the VFS_VRELE patch. Reference counting of vnodes does not need
to be done per-fs. I noticed this while fixing vfs layering violations.
Doing reference counting in generic code is also the preference cited by
John Heidemann in recent discussions with him.

The implementation of alternative vnode management per-fs is still a valid
requirement for some filesystems but will be revisited sometime later,
most likely using a different framework.

Submitted by: Michael Hancock <michaelh@cet.co.jp>


# 35105 08-Apr-1998 wosch

New mount option nosymfollow. If enabled, the kernel lookup()
function will not follow symbolic links on the mounted
file system and return EACCES (Permission denied).


# 34927 28-Mar-1998 bde

Don't export anything from <sys/socket.h> except AF_MAX from here.
This only affects the KERNEL case.

Don't include <sys/radix.h> twice for the KERNEL case. This fixes
a mismerge from Lite2.

Don't include <sys/radix.h> at all for the !KERNEL case. This fixes
a wrong cleanup in Lite2.


# 34924 28-Mar-1998 bde

Moved some #includes from <sys/param.h> nearer to where they are actually
used.


# 34266 08-Mar-1998 julian

Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)
Submitted by: Kirk McKusick (mcKusick@mckusick.com)
Obtained from: WHistle development tree


# 33964 01-Mar-1998 msmith

The intent is to get rid of WILLRELE in vnode_if.src by making
a complement to all ops that return a vpp, VFS_VRELE. This is
initially only for file systems that implement the following ops
that do a WILLRELE:

vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link,
vop_rename, vop_mkdir, vop_rmdir, vop_symlink

This is initial DNA that doesn't do anything yet. VFS_VRELE is
implemented but not called.

A default vfs_vrele was created for fs implementations that use the
standard vnode management routines.

VFS_VRELE implementations were made for the following file systems:

Standard (vfs_vrele)
ffs mfs nfs msdosfs devfs ext2fs

Custom
union umapfs

Just EOPNOTSUPP
fdesc procfs kernfs portal cd9660

These implementations may change as VOP changes are implemented.

In the next phase, in the vop implementations calls to vrele and the vrele
part of vput will be moved to the top layer vfs_vnops and made visible
to all layers. vput will be replaced by unlock in these cases. Unlocking
will still be done in the per fs layer but the refcount decrement will be
triggered at the top because it doesn't hurt to hold a vnode reference a
little longer. This will have minimal impact on the structure of the
existing code.

This will only be done for vnode arguments that are released by the various
fs vop implementations.

Wider use of VFS_VRELE will likely require restructuring of the code.

Reviewed by: phk, dyson, terry et. al.
Submitted by: Michael Hancock <michaelh@cet.co.jp>


# 33733 21-Feb-1998 jkh

MF22: correct comments.


# 33722 21-Feb-1998 jkh

MF22: CODA entries. They'll have to rework their usage of malloc
somewhat in -current before this will work, but these should at least
serve as place-holders.


# 33122 05-Feb-1998 dyson

Add MNT_LAZY.


# 32644 20-Jan-1998 bde

Moved most of the (source-level) compatibility hacks for the vfsconf
interface from sys/mount.h to libc/getvfsent.c The new interface is
now the default.

Sorted the prototypes for the library functions.


# 31403 25-Nov-1997 julian

Shift a few SYSINT() calls around.
this results in a few functions becoming static, and
the SYSINITs being close to the code they are related to.
setting up the dump device is with dumpsys() and
kicking off the scheduler is with the scheduler.
Mounting root is with the code that does it.

Reviewed by: phk


# 31346 22-Nov-1997 bde

Fixed some style and contents bugs in comments. Copied comments are
usually wrong.


# 31144 12-Nov-1997 julian

Reviewed by: hackers@freebsd.org in general
Obtained from: Whistle Communications tree

Add an option to the way UFS works dependent on the SUID bit of directories
This changes makes things a whole lot simpler on systems running as
fileservers for PCs and MACS. to enable the new code you must
1/ enable option SUIDDIR on the kernel.
2/ mount the filesystem with option suiddir.
hopefully this makes it difficult enough for people to
do this accidentally.
see the new chmod(2) man page for detailed info.


# 31132 12-Nov-1997 julian

Reviewed by: various.

Ever since I first say the way the mount flags were used I've hated the
fact that modes, and events, internal and exported, and short-term
and long term flags are all thrown together. Finally it's annoyed me enough..
This patch to the entire FreeBSD tree adds a second mount flag word
to the mount struct. it is not exported to userspace. I have moved
some of the non exported flags over to this word. this means that we now
have 8 free bits in the mount flags. There are another two that might
well move over, but which I'm not sure about.
The only user visible change would have been in pstat -v, except
that davidg has disabled it anyhow.
I'd still like to move the state flags and the 'command' flags
apart from each other.. e.g. MNT_FORCE really doesn't have the
same semantics as MNT_RDONLY, but that's left for another day.


# 30354 12-Oct-1997 phk

Last major round (Unless Bruce thinks of somthing :-) of malloc changes.

Distribute all but the most fundamental malloc types. This time I also
remembered the trick to making things static: Put "static" in front of
them.

A couple of finer points by: bde


# 29888 27-Sep-1997 kato

Clustered read and write are switched at mount-option level.

1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW
bits of the mnt_flag. The sysctl variables, vfs.foo.doclusterread
and vfs.foo.doclusterwrite are deleted. Only mount option can
control clustered I/O from userland.
2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR
and D_CLUSTERW bits of the d_flags member in the block device switch
table. If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR /
MNT_NOCLUSTERW bits will be set. In this case, MNT_NOCLUSTERR and
MNT_NOCLUSTERW cannot be cleared from userland.
3. Vnode driver disables both clustered read and write.
4. Union filesystem disables clutered write.

Reviewed by: bde


# 29513 16-Sep-1997 bde

Drop temporary source-level compatibility for old mount(2) interface.


# 28270 16-Aug-1997 wollman

Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs. (Socket buffers are the one exception.) A number
of kernel APIs needed to get fixed in order to make this happen. Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead. Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.


# 27606 22-Jul-1997 bde

Quick and dirty (?) fix for noatime option. The WebNFS changes
broke it by using the same value for MNT_EXPUBLIC as for MNT_NOATIME.
Just use a different value for MNT_EXPUBLIC.


# 27461 16-Jul-1997 dfr

Merge WebNFS changes from NetBSD.

Obtained from: NetBSD


# 24674 06-Apr-1997 dufault

Make MOD_* macros almost consistent:

Use the name argument almost the same in all LKM types. Maintain
the current behavior for the external (e.g., modstat) name for DEV,
EXEC, and MISC types being #name ## "_mod" and SYCALL and VFS only
#name. This is a candidate for change and I vote just the name without
the "_mod".

Change the DISPATCH macro to MOD_DISPATCH for consistency with the
other macros.

Add an LKM_ANON #define to eliminate the magic -1 and associated
signed/unsigned warnings.

Add MOD_PRIVATE to support wcd.c's poking around in the lkm structure.

Change source in tree to use the new interface.

Reviewed by: Bruce Evans


# 23331 03-Mar-1997 bde

Fixed the getvfsbyname macro hack.


# 23289 02-Mar-1997 bde

Restored some pre-Lite2-merge source-level compatibility to the mount()
and getvfsbyname() interfaces. The new interfaces are now hidden from
applications unless _NEW_VFSCONF is defined. The new vfsconf interfaces
don't work yet.


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22602 12-Feb-1997 mpp

Remove function prototypes for vfs_mountroot and vgoneall, since
they were removed with the Lite2 merge.

Submitted by: bde


# 22579 12-Feb-1997 mpp

Add function prototypes for most of the new Lite2 functions.
Also made a few of the miscfs routines static to be
consistent. Some modules simply required some additional
#includes to remove -Wall warnings.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 18990 17-Oct-1996 jkh

Some very small changes to support Netcon's TFS filesystem.
These patches were formerly applied by the Netcon installer
before rebuilding your kernel.


# 18257 12-Sep-1996 bde

Added a struct tag `fsid' for fsid_t so that sysproto.h can declare
prototypes for the lfs syscalls without having to include <sys/mount.h>
and its nested spam.


# 18006 03-Sep-1996 dg

Implemented kernel side of MNT_NOATIME mount option. This option disables
the file access time update on reads and can be useful in reducing
filesystem overhead in cases where the access time is not important (like
Usenet news spools).


# 13765 30-Jan-1996 mpp

Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.


# 12972 22-Dec-1995 phk

Remove the now obsolete vfs_sysctl vfsops element.


# 12255 13-Nov-1995 bde

Replaced nosys() by lkm_nullcmd().


# 12117 05-Nov-1995 dyson

Changes to existing files for ext2fs support. The UFS mods need rework
in the future as they are a bit crufty -- but at least the stuff is in the
tree now.


# 10653 09-Sep-1995 dg

Fixed init functions argument type - caddr_t -> void *. Fixed a couple of
compiler warnings.


# 10431 29-Aug-1995 bde

Declare vfs_mountroot() in the right place.


# 10358 28-Aug-1995 julian

Reviewed by: julian with quick glances by bruce and others
Submitted by: terry (terry lambert)
This is a composite of 3 patch sets submitted by terry.
they are:
New low-level init code that supports loadbal modules better
some cleanups in the namei code to help terry in 16-bit character support
some changes to the mount-root code to make it a little more
modular..

NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able
to test those cases..

certainly mounting root of disk still works just fine..
mfs should work but is untested. (tomorrows task)

The low level init stuff includes a total rewrite of init_main.c
to make it possible for new modules to have an init phase by simply
adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can
be added to the kernel without editing any other files other than the
'files' file.


# 10289 26-Aug-1995 dg

Killed MNT_NOAUTO.


# 10201 23-Aug-1995 jkh

Damn! As Rod just reminded me, I didn't apply the tweak to make
this obey the mask properly. I had it locally, but not in the diffs
I brought across.. :-( Thanks, Rod.


# 10200 23-Aug-1995 jkh

Support for NOAUTO mounts.
Submitted by: "Full Name Not Supplied" <simon@masi.ibp.fr>


# 10027 11-Aug-1995 dg

Converted mountlist to a CIRCLEQ.

Partially obtained from: 4.4BSD-Lite2


# 9336 27-Jun-1995 dfr

Changes to support version 3 of the NFS protocol.
The version 2 support has been tested (client+server) against FreeBSD-2.0,
IRIX 5.3 and FreeBSD-current (using a loopback mount). The version 2 support
is stable AFAIK.
The version 3 support has been tested with a loopback mount and minimally
against an IRIX 5.3 server. It needs more testing and may have problems.
I have patched amd to support the new variable length filehandles although
it will still only use version 2 of the protocol.

Before booting a kernel with these changes, nfs clients will need to at least
build and install /usr/sbin/mount_nfs. Servers will need to build and
install /usr/sbin/mountd.

NFS diskless support is untested.

Obtained from: Rick Macklem <rick@snowhite.cis.uoguelph.ca>


# 8876 30-May-1995 rgrimes

Remove trailing whitespace.


# 8692 21-May-1995 dg

Changes to fix the following bugs:

1) Files weren't properly synced on filesystems other than UFS. In some
cases, this lead to lost data. Most likely would be noticed on NFS.
The fix is to make the VM page sync/object_clean general rather than
in each filesystem.
2) Mixing regular and mmaped file I/O on NFS was very broken. It caused
chunks of files to end up as zeroes rather than the intended contents.
The fix was to fix several race conditions and to kludge up the
"b_dirtyoff" and "b_dirtyend" that NFS relies upon - paying attention
to page modifications that occurred via the mmapping.

Reviewed by: David Greenman
Submitted by: John Dyson


# 7945 20-Apr-1995 julian

Reviewed by: no-one yet, but non-intrusive
Submitted by: julian@tfs.com
Obtained from: written from scratch

slight changes to make space for devfs..
(also conditional test code in i386/isa/fd.c)

===================================================================
RCS file: /home/ncvs/src/sys/sys/malloc.h,v
retrieving revision 1.7
diff -r1.7 malloc.h
113a114,117
> #define M_DEVFSMNT 62 /* DEVFS mount structure */
> #define M_DEVFSBACK 63 /* DEVFS Back node */
> #define M_DEVFSFRONT 64 /* DEVFS Front node */
> #define M_DEVFSNODE 65 /* DEVFS node */
184c188,192
< NULL, NULL, NULL, NULL, NULL, \
---
> "DEVFS mount", /* 62 M_DEVFSMNT */ \
> "DEVFS back", /* 63 M_DEVFSBACK */ \
> "DEVFS front", /* 64 M_DEVFSFRONT */ \
> "DEVFS node", /* 65 M_DEVFSNODE */ \
> NULL, \
Index: sys/mount.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/mount.h,v
retrieving revision 1.16
diff -r1.16 mount.h
100c100,101
< #define MOUNT_MAXTYPE 15
---
> #define MOUNT_DEVFS 16 /* existing device Filesystem */
> #define MOUNT_MAXTYPE 16
118a120
> "devfs", /* 15 MOUNT_DEVFS */ \
Index: sys/vnode.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/vnode.h,v
retrieving revision 1.19
diff -r1.19 vnode.h
61c61
< VT_UNION, VT_MSDOSFS
---
> VT_UNION, VT_MSDOSFS, VT_DEVFS


# 7743 10-Apr-1995 wollman

Correct name `cd9660' for MOUNT_CD9660 (but NB that this whole table
is bogus and only exists for the benefit of find(1)). Old name was
`iso9660fs'.

Submitted by: Andrew Atrens <atreand@statcan.ca>


# 7095 16-Mar-1995 wollman

Add four more filesystem flags:

VFCF_NETWORK (this FS goes over the net)
VFCF_READONLY (read-write mounts do not make any sense)
VFCF_SYNTHETIC (data in this FS is not real)
VFCF_LOOPBACK (this FS aliases something else)

cd9660 is readonly; nullfs, umapfs, and union are loopback; NFS is netowkr;
procfs, kernfs, and fdesc are synthetic.


# 7093 16-Mar-1995 wollman

Statically-compiled filesystems now use a VFCF_STATIC flag rather than
abusing the refcount.


# 7090 16-Mar-1995 bde

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 3731 19-Oct-1994 wollman

Actually implement the functionality documented in sysctl.h for type CTL_FS.
(Namely, call a filesystem-dependent sysctl function analogous to how it works
for networking and (now) physical devices.)


# 3151 27-Sep-1994 phk

ktrace.c: added decl of ktrnamei
lkm.h: added decl of lkmdispatch
mount.h: added decl of vfs_busy,vfs_unbusy
syscall: The "created from" changed.


# 2962 21-Sep-1994 wollman

mount.h: Declare getvfs* functions from libc.
vfs_init.c: Fix fs_sysctl() so that getvfs* functions actually work.


# 2960 21-Sep-1994 wollman

Fix a few niggling little bugs:

- set args->lkm_offset correctly so that VFS modules can be unloaded
- initialize _fs_vfsops.vfc_refcount correctly so that VFS modules can
be unloaded
- include kernel.h in a few placves to get the correct definition of DATA_SET


# 2946 21-Sep-1994 wollman

Implemented loadable VFS modules, and made most existing filesystems
loadable. (NFS is a notable exception.)


# 2893 19-Sep-1994 dfr

Added msdosfs.

Obtained from: NetBSD


# 2811 15-Sep-1994 bde

Add some prototypes.


# 2213 22-Aug-1994 bde

- Fix warnings in df, etc. caused by misplaced declaration of doumount().
- Fix bogus comments caused by misplaced #endif.


# 2165 21-Aug-1994 paul

Made them all idempotent.
Reviewed by:
Submitted by:


# 2152 20-Aug-1994 dg

Implemented filesystem clean bit via:

machdep.c:
Changed printf's a little and call vfs_unmountall() if the sync was
successful.

cd9660_vfsops.c, ffs_vfsops.c, nfs_vfsops.c, lfs_vfsops.c:
Allow dismount of root FS. It is now disallowed at a higher level.

vfs_conf.c:
Removed unused rootfs global.

vfs_subr.c:
Added new routines vfs_unmountall and vfs_unmountroot. Filesystems
are now dismounted if the machine is properly rebooted.

ffs_vfsops.c:
Toggle clean bit at the appropriate places. Print warning if an
unclean FS is mounted.

ffs_vfsops.c, lfs_vfsops.c:
Fix bug in selecting proper flags for VOP_CLOSE().

vfs_syscalls.c:
Disallow dismounting root FS via umount syscall.


# 1817 02-Aug-1994 dg

Added $Id$


# 1542 24-May-1994 rgrimes

This commit was generated by cvs2svn to compensate for changes in r1541,
which included commits to RCS files with non-trunk default branches.


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources