History log of /freebsd-9.3-release/sys/nfsserver/nfs_serv.c
Revision Date Author Comments
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 265723 08-May-2014 rmacklem

MFC: r264888
The PR reported that the old NFS server did not set uio_td == NULL
for the VOP_READ() call. This patch fixes both the old and new
server for this case.


# 244658 24-Dec-2012 kib

MFC r241025:
Fix the mis-handling of the VV_TEXT on the nullfs vnodes.
Add a set of VOPs for the VV_TEXT query, set and clear operations,
which are correctly bypassed to lower vnode.


# 229617 05-Jan-2012 jhb

MFC 228185:
Enhance the sequential access heuristic used to perform readahead in the
NFS server and reuse it for writes as well to allow writes to the backing
store to be clustered.


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 225356 02-Sep-2011 rmacklem

Fix the NFS servers so that they can do a Lookup of "..",
which requires that ni_strictrelative be set to 0, post-r224810.

Tested by: swills (earlier version), geo dot liaskos at gmail.com
Approved by: re (kib)


# 219028 25-Feb-2011 netchild

Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availability if
needed.

Sponsored by: Google Summer of Code 2010
Submitted by: kibab
Reviewed by: arch@ (parts by rwatson, trasz, jhb)
X-MFC after: to be determined in last commit with code from this project


# 218345 05-Feb-2011 alc

Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync() are
incorrectly calling vm_object_page_clean(). They are passing the length of
the range rather than the ending offset of the range.

Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the
callers.

Reviewed by: kib
MFC after: 3 weeks


# 216774 28-Dec-2010 pjd

ZFS might not return monotonically increasing directory offset cookies,
so turn off UFS-specific hack that assumes so in ZFS case.
Before the change we could miss returning some directory entries to an
NFS client.

I believe that the hack should be moved to ufs_readdir(), but until we find
somebody who will do it, turn it off for ZFS in NFS server code.

Submitted by: rmacklem
Discussed with: rmacklem, mckusick
MFC after: 3 days


# 216633 21-Dec-2010 pjd

Use newly added NFSRV_FLAG_BUSY flag for nfsrv_fhtovp() to keep mount point
busy. This fixes a race where we can pass invalid mount point to VFS_VGET()
via vp->v_mount when exported file system was forcibly unmounted between
nfsrv_fhtovp() and VFS_VGET().

Reviewed by: kib
MFC after: 5 days


# 216632 21-Dec-2010 pjd

- Move pubflag and lockflag handling from nfsrv_fhtovp() to nfs_namei() -
this is the only place that is different from all the other nfsrv_fhtovp()
consumers.
This simplifies nfsrv_fhtovp() a bit and also eliminates one
vn_lock/VOP_UNLOCK() cycle in case of NFSv3.
- Implement NFSRV_FLAG_BUSY flag for nfsrv_fhtovp() that tells it to leave
mount point busy.

Reviewed by: kib
MFC after: 5 days


# 216565 19-Dec-2010 pjd

Reduce lock scope a little.


# 216454 15-Dec-2010 kib

VOP_ISLOCKED() should not be used to determine if the vnode is locked.
Explicitly track the locked status of the vnode.

Reviewed by: pjd
Tested by: avg
MFC after: 1 week


# 214851 05-Nov-2010 kib

Fix a bug in r214049. The nvp == vp case shall be handled specially
only for !usevget case. If VFS_VGET is working, the vnode shared lock
is obtained recursively and vput() shall be done, not vunref().

Submitted by: rmacklem
Tested by: Josh Carroll <josh.carroll gmail com>
MFC after: 3 days


# 214049 19-Oct-2010 kib

When readdirplus() is handled on the exported filesystem that does
not support VFS_VGET, like msdosfs, do not call VOP_LOOKUP() for
dotdot on the root directory. Our filesystems expect that VFS handles
dotdot lookups on root on its own.

Reported and tested by: kevlo
MFC after: 2 weeks


# 211854 26-Aug-2010 pjd

- When VFS_VGET() is not supported, switch to VOP_LOOKUP().
- We are fine by only share-locking the vnode.
- Remove assertion that doesn't hold for ZFS where we cross mount points
boundaries by going into .zfs/snapshot/<name>/.

Reviewed by: rmacklem
MFC after: 1 month


# 200084 03-Dec-2009 jhb

Properly return an error reply if an NFS remove or link operation fails.
Previously the failing operation would allocate an mbuf and construct an
error reply, but because the function did not return 0, the NFS server
assumed it had failed to generate a reply and would leak the reply mbuf as
well as not sending the reply to the NFS client.

PR: kern/140853
Submitted by: Ted Faber faber at isi edu (remove)
Reviewed by: rmacklem (remove)
MFC after: 1 week
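
The failure mode described in this entry can be sketched in userland as follows. This is a minimal illustration under stated assumptions, not the server's code: the static pool stands in for mbuf allocation, and sketch_reply, handle_remove() and dispatch() are hypothetical names. The `fixed` argument switches between the old and the corrected return convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical reply object; a static pool stands in for mbufs. */
struct sketch_reply {
	int	in_use;
	int	status;
};

static struct sketch_reply pool[4];

static struct sketch_reply *
reply_alloc(int status)
{
	for (size_t i = 0; i < sizeof(pool) / sizeof(pool[0]); i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			pool[i].status = status;
			return (&pool[i]);
		}
	}
	return (NULL);
}

/* Number of replies allocated but never released. */
static int
outstanding(void)
{
	int n = 0;

	for (size_t i = 0; i < sizeof(pool) / sizeof(pool[0]); i++)
		n += pool[i].in_use;
	return (n);
}

/*
 * The operation failed with op_error and an error reply is built.
 * Old behaviour (fixed == 0): return the error, hiding the reply.
 * New behaviour (fixed != 0): return 0 because a reply exists.
 */
static int
handle_remove(int op_error, int fixed, struct sketch_reply **rp)
{
	*rp = reply_alloc(op_error);
	if (*rp == NULL)
		return (ENOMEM);
	return (fixed ? 0 : op_error);
}

/* Returns 1 if a reply was sent to the client, 0 otherwise. */
static int
dispatch(int op_error, int fixed)
{
	struct sketch_reply *rp = NULL;

	if (handle_remove(op_error, fixed, &rp) != 0)
		return (0);	/* assumes no reply exists: not sent, not freed */
	rp->in_use = 0;		/* "send" the reply, then release it */
	return (1);
}
```

With the old convention the reply is neither sent nor released, so outstanding() grows by one per failed request. With the corrected convention the error reply goes out and the count is unchanged.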


# 197525 26-Sep-2009 pjd

Ensure that tv_sec is between INT32_MIN and INT32_MAX, so ZFS won't object.
This completes the fix from r185586.

PR: kern/139059
Reported by: Daniel Braniss <danny@cs.huji.ac.il>
Submitted by: Jaakko Heinonen <jh@saunalahti.fi>
Tested by: Daniel Braniss <danny@cs.huji.ac.il>
MFC after: 3 days


# 197040 09-Sep-2009 pjd

Correct typo after manual patching.

Noticed by: b. f.


# 197039 09-Sep-2009 pjd

Fix usecount leak in mknod(2) on file system exported over NFS.

While I'm here, correct typo in comment.

Reviewed by: kan, kib
MFC after: 3 days


# 195202 30-Jun-2009 dfr

Remove the old kernel RPC implementation and the NFS_LEGACYRPC option.

Approved by: re


# 191990 11-May-2009 attilio

Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavily changed by this commit, so third-party modules need
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.


# 188588 13-Feb-2009 jhb

Use shared vnode locks when invoking VOP_READDIR().

MFC after: 1 month


# 186165 16-Dec-2008 kensmith

Handle VFS_VGET() failing with an error other than EOPNOTSUPP in addition
to failing with that error.

PR: 125149
Submitted by: Jaakko Heinonen (jh <at> saunalahti <dot> fi)
Reviewed by: mohans, kan
MFC after: 3 days


# 185586 03-Dec-2008 kan

Change nfsserver slightly so that it does not trip over the timestamp
validation code on ZFS.

Problem: when opening a file with O_CREAT|O_EXCL, NFS has to jump through
extra hoops to ensure O_EXCL semantics. Namely, the client supplies 8
bytes (NFSX_V3CREATEVERF) of verification data to uniquely
identify this create request. The server then creates a new file with access
mode 0, copies the received 8 bytes into the va_atime member of struct vattr and
attempts to set the atime on the file using VOP_SETATTR. If that succeeds, it
fetches file attributes with VOP_GETATTR and verifies that the atime
timestamps match. If the timestamps do not match, the NFS server concludes it
has probably lost the race to another process creating the file with the
same name and bails with EEXIST.

This scheme works OK when exported FS is FFS, but if underlying
filesystem is ZFS _and_ server is running 64bit kernel, it breaks down
due to sanity checking in zfs_setattr function, which refuses to accept
any timestamps which have tv_sec that cannot be represented as 32bit
int. Since struct timespec fields are 64 bit integers on 64bit platforms
and server just copies NFSX_V3CREATEVERF bytes into va_atime, all eight
bytes supplied by client end up in va_atime.tv_sec, forcing it out of
valid 32bit range.

The solution this change implements is simple: it treats
NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into
va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing
that tv_sec remains in 32 bit range and ZFS remains happy.

Reviewed by: kib
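
The unpacking described in this entry can be sketched as below. This is a userland illustration, not the committed code: sketch_timespec, CREATEVERF_LEN and verf_to_atime() are hypothetical names, and the use of signed 32-bit words copied in host byte order is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	CREATEVERF_LEN	8	/* stands in for NFSX_V3CREATEVERF */

/* Hypothetical stand-in for struct timespec on a 64-bit platform. */
struct sketch_timespec {
	int64_t	tv_sec;
	int64_t	tv_nsec;
};

/*
 * Treat the 8-byte verifier as two 32-bit integers and store them
 * separately, so neither field can leave the 32-bit range.
 */
static void
verf_to_atime(const unsigned char verf[CREATEVERF_LEN],
    struct sketch_timespec *ts)
{
	int32_t words[2];

	assert(ts != NULL);
	memcpy(words, verf, sizeof(words));
	ts->tv_sec = words[0];
	ts->tv_nsec = words[1];
}
```

Copying all eight bytes straight into a 64-bit tv_sec, by contrast, can produce any 64-bit value, which is what tripped the range check in zfs_setattr.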


# 184588 03-Nov-2008 dfr

Implement support for RPCSEC_GSS authentication to both the NFS client
and server. This replaces the RPC implementation of the NFS client and
server with the newer RPC implementation originally developed
(actually ported from the userland sunrpc code) to support the NFS
Lock Manager. I have tested this code extensively and I believe it is
stable and that performance is at least equal to the legacy RPC
implementation.

The NFS code currently contains support for both the new RPC
implementation and the older legacy implementation inherited from the
original NFS codebase. The default is to use the new implementation -
add the NFS_LEGACYRPC option to fall back to the old code. When I
merge this support back to RELENG_7, I will probably change this so
that users have to 'opt in' to get the new code.

To use RPCSEC_GSS on either client or server, you must build a kernel
which includes the KGSSAPI option and the crypto device. On the
userland side, you must build at least a new libc, mountd, mount_nfs
and gssd. You must install new versions of /etc/rc.d/gssd and
/etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf.

As long as gssd is running, you should be able to mount an NFS
filesystem from a server that requires RPCSEC_GSS authentication. The
mount itself can happen without any kerberos credentials but all
access to the filesystem will be denied unless the accessing user has
a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There
is currently no support for situations where the ticket file is in a
different place, such as when the user logged in via SSH and has
delegated credentials from that login. This restriction is also
present in Solaris and Linux. In theory, we could improve this in
future, possibly using Brooks Davis' implementation of variant
symlinks.

Supporting RPCSEC_GSS on a server is nearly as simple. You must create
service creds for the server in the form 'nfs/<fqdn>@<REALM>' and
install them in /etc/krb5.keytab. The standard heimdal utility ktutil
makes this fairly easy. After the service creds have been created, you
can add a '-sec=krb5' option to /etc/exports and restart both mountd
and nfsd.

The only other difference an administrator should notice is that nfsd
doesn't fork to create service threads any more. In normal operation,
there will be two nfsd processes, one in userland waiting for TCP
connections and one in the kernel handling requests. The latter
process will create as many kthreads as required - these should be
visible via 'top -H'. The code has some support for varying the number
of service threads according to load but initially at least, nfsd uses
a fixed number of threads according to the value supplied to its '-n'
option.

Sponsored by: Isilon Systems
MFC after: 1 month


# 184561 02-Nov-2008 trhodes

Document a few sysctls in the NFS client and server code.
Minor style(9) where applicable.

Approved by: alfred (slightly older version)


# 184413 28-Oct-2008 trasz

Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by: rwatson (mentor)
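
The truncation risk that motivates this entry can be shown with a small sketch. The typedefs and the flag value are hypothetical: sketch_mode_t mirrors the 16-bit mode_t mentioned above, while the 32-bit width of sketch_accmode_t and the SKETCH_VEXTRA constant are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t sketch_mode_t;		/* 16 bits, as noted for mode_t */
typedef uint32_t sketch_accmode_t;	/* assumption: a wider type */

#define	SKETCH_VEXTRA	UINT32_C(0x00010000)	/* hypothetical flag above bit 15 */

/* Returns non-zero if the access bits survive a trip through a 16-bit mode. */
static int
fits_in_mode(sketch_accmode_t bits)
{
	sketch_mode_t narrow = (sketch_mode_t)bits;	/* high bits are dropped */

	return (narrow == bits);
}
```

Any flag defined above bit 15 is silently lost when stored in the narrower type, which is why a dedicated type is needed before more V* constants can be added.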


# 184205 23-Oct-2008 des

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# 183809 12-Oct-2008 rwatson

Turn XXX's for unlocked writes of NFS server statistics to simple notes,
as we consider it a feature to exchange performance for consistency.

MFC after: 3 days


# 183113 17-Sep-2008 attilio

Remove the suser(9) interface from the kernel. It has been replaced for
years by the priv_check(9) interface and just very few places are left.
Note that compatibility stub with older FreeBSD version
(all above the 8 limit though) are left in order to reduce diffs against
old versions. It is responsibility of the maintainers for any module, if
they think it is the case, to axe out such cases.

This patch breaks KPI so __FreeBSD_version will be bumped into a later
commit.

This patch needs to be credited 50-50 with rwatson@ as he found time to
explain to me how priv_check() works in detail and to review patches.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Reviewed by: rwatson


# 183103 16-Sep-2008 attilio

Decontext-alize the nfsserver module.
Now, only some few places still require thread passing (mostly the ones which
access to VOP_* functions) and will be fixed once the primitive also will be.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


# 182371 28-Aug-2008 attilio

Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and therefore useless.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


# 179381 28-May-2008 kib

Change the fix in the rev. 1.179 to use nfsrv_lockedpair_nd().

Tested by: pho
MFC after: 3 days


# 179380 28-May-2008 kib

Initialize vfslocked prior to calling nfsm_srvmtofh where it was forgotten.

Reported by: Andrew Edwards <aedwards sandvine com>
Tested by: pho
MFC after: 3 days


# 177599 25-Mar-2008 ru

Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT.
Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true
since the advent of MBUMA.

Reviewed by: arch

There are ongoing disputes as to whether we want to switch to directly using
UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.


# 177493 22-Mar-2008 jeff

- Complete part of the unfinished bufobj work by consistently using
BO_LOCK/UNLOCK/MTX when manipulating the bufobj.
- Create a new lock in the bufobj to lock bufobj fields independently.
This leaves the vnode interlock as an 'identity' lock while the bufobj
is an io lock. The bufobj lock is ordered before the vnode interlock
and also before the mnt ilock.
- Exploit this new lock order to simplify softdep_check_suspend().
- A few sync related functions are marked with a new XXX to note that
we may not properly interlock against a non-zero bv_cnt when
attempting to sync all vnodes on a mountlist. I do not believe this
race is important. If I'm wrong this will make these locations easier
to find.

Reviewed by: kib (earlier diff)
Tested by: kris, pho (earlier diff)


# 176790 04-Mar-2008 kib

Fix the Giant leak in the nfsrv_remove().

Reported by: pluknet <pluknet gmail com>
MFC after: 1 week


# 175294 13-Jan-2008 attilio

VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjunction with 'thread' argument passing which is always curthread.
Remove the useless extra argument and pass curthread explicitly to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>


# 175202 09-Jan-2008 attilio

vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modification makes the code cleaner and in
particular removes an annoying dependence, helping the next lockmgr() cleanup.
The KPI has, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>


# 172759 18-Oct-2007 jhb

Add a -z flag to nfsstat which zeros the NFS statistics after displaying
them.

MFC after: 1 week
Requested by: ps
Submitted by: ps (6 years ago)


# 170689 13-Jun-2007 rwatson

Include priv.h to pick up suser(9) definitions, missed in an earlier
commit.

Warnings spotted by: kris


# 170488 10-Jun-2007 mjacob

Init timespec to zero to quiesce warnings.


# 167899 26-Mar-2007 jhb

Initialize vfslocked to 0 before nfsm_srvmtofh() so that the variable is
not used uninitialized in 'nfsmout' if nfsm_srvmtofh() gets an internal
error.

CID: 1766
Found by: Coverity Prevent (tm)


# 167665 17-Mar-2007 jeff

- Turn all explicit giant acquires into conditional VFS_LOCK_GIANTs.
Only ops which used namei still remained.
- Implement a scheme for reducing the overhead of tracking which vops
require giant by constantly reducing the number of recursive giant
acquires to one, leaving us with only one vfslocked variable.
- Remove all NFSD lock acquisition and release from the individual nfs
ops. Careful examination has shown that they are not required. This
greatly simplifies the code.

Sponsored by: Isilon Systems, Inc.
Discussed with: rwatson
Tested by: kkenn
Approved by: re


# 166774 15-Feb-2007 pjd

Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.
This way we may support multiple structures in v_data vnode field within
one file system without using black magic.

Vnode-to-file-handle should be VOP in the first place, but was made VFS
operation to keep interface as compatible as possible with SUN's VFS.
BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.

VFS_VPTOFH() was left for API backward compatibility, but is marked for
removal before 8.0-RELEASE.

Approved by: mckusick
Discussed with: many (on IRC)
Tested with: ufs, msdosfs, cd9660, nullfs and zfs


# 166683 13-Feb-2007 mpp

Get the vfs giant lock before calling nfs_access.

Reviewed by: mohan


# 164585 24-Nov-2006 rwatson

Push Giant a bit further off the NFS server in a number of straight
forward cases by converting from unconditional acquisition of Giant
around vnode operations to conditional acquisition:

- Remove nfsrv_access_withgiant(), and cause nfsrv_access() to now
assert that Giant will be held if it is required for the vnode.

- Add nfsrv_fhtovp_locked(), which will drop the NFS server lock if
required, and modify nfsrv_fhtovp() to conditionally acquire
Giant if required.

- In the VOP's not dealing with more than one vnode at a time (i.e.,
not involving a lookup), conditionally acquire Giant.

This removes Giant use for MPSAFE file systems for a number of quite
important RPCs, including getattr, read, write. It leaves
unconditional Giant acquisitions in vnode operations that interact
with the name space or more than one vnode at a time as these
require further work.

Tested by: kris
Reviewed by: kib


# 164436 20-Nov-2006 pjd

Protect nfsm_srvpathsiz() call with the nfsd_mtx lock.

Reviewed by: mohans


# 163701 26-Oct-2006 kib

Fix leak in NAMEI zone caused by nfs server when VOP_RENAME fails.

Submitted by: Padma Bhooma <pbhooma at panasas com>
Reviewed by: bde
Approved by: pjd (mentor)
MFC after: 1 week


# 159268 05-Jun-2006 kib

Temporary workaround to prevent leak of Giant from nfsd when calling
lookup().

Reviewed by: tegge
Tested by: "Arno J. Klaassen" <arno at heho snv jussieu fr>, "Rong-en Fan" <grafan at gmail com>, Dmitriy Kirhlarov <dimma at higis ru>, Dmitry Pryanishnikov <dmitry at atlantis dp ua>
MFC after: 1 week
Approved by: kan, pjd (mentors)


# 157325 31-Mar-2006 jeff

- Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs().

Discussed with: tegge
Tested by: kris
Sponsored by: Isilon Systems, Inc.


# 156586 12-Mar-2006 jeff

- Reorder vrele calls after vput calls to prevent lock order reversals
between leaf and directory locks.

Found by: kris
Sponsored by: Isilon Systems, Inc.


# 155160 31-Jan-2006 jeff

- Reorder calls to vrele() after calls to vput() when the vrele is a
directory. vrele() may lock the passed vnode, which in these cases would
give an invalid lock order of child -> parent. These situations are
deadlock prone although do not typically deadlock because the vrele
is typically not releasing the last reference to the vnode. Users of
vrele must consider it as a call to vn_lock() and order it appropriately.

MFC After: 1 week
Sponsored by: Isilon Systems, Inc.
Tested by: kkenn


# 154960 28-Jan-2006 csjp

Manage the ucred for the NFS server using the crget/crfree API defined in
kern_prot.c. This API handles reference counting among many other things.
Notably, if MAC is compiled into the kernel, it will properly initialize the
MAC labels when the ucred is allocated.

This work is in preparation for a new MAC entry point which will be responsible
for properly initializing policy specific labels for the NFS server credential.
Utilization of the crfree/crget APIs reduce the complexity associated with
this label's management.

Submitted by: green (with changes) [1]
Obtained from: TrustedBSD Project
Discussed with: rwatson, alfred

[1] I moved the ucred allocation outside the scope of the NFS server lock to
prevent M_WAITOK allocations from occurring with non-sleep-able locks held.
Additionally, to reduce complexity, the ucred persist as long as the NFS
server descriptor.


# 154737 23-Jan-2006 trhodes

Revert my previous commit.

Proved I'm not that bright at times: jhb


# 154729 23-Jan-2006 trhodes

Fix indentation.

Prodded by: stefanf, ru, njl (in that order)


# 154629 21-Jan-2006 trhodes

Remove some dead code.

Found with: Coverity Prevent(tm)


# 151761 27-Oct-2005 glebius

Keep locks consistent before goto.

Reported by: pho
Reviewed by: mohans


# 145197 17-Apr-2005 rwatson

NFS write gathering defers execution of NFS server write requests to wait
to see if additional write requests will arrive that can be coalesced and
clustered with earlier ones. When doing so, it must determine whether
the two requests are made by credentials with the same access rights, so
as not to coalesce improperly. NFSW_SAMECRED() implements a test of two
credentials using a binary compare.

Replace NFSW_SAMECRED() macro with nfsrv_samecred() function, which is
aware of the contents and layout of a struct ucred, rather than a simple
binary compare. While the binary compare works when ucred is simply a
zero'd and embedded 'struct ucred' in the NFS descriptor, it will work
less well when the ucred associated with an NFS descriptor is "real", so
has defined and populated reference count, mutex, etc.

MFC after: 1 week
Obtained from: TrustedBSD Project
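
The difference between a binary compare and a layout-aware compare can be sketched as follows. This is an illustration under assumptions, not nfsrv_samecred() itself: sketch_ucred is a hypothetical, much-reduced credential, and the choice of fields to compare (uid, group count, group list) is an assumption about what "same credential" should mean.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define	SKETCH_NGROUPS	16

/* Hypothetical reduced credential; cr_ref stands in for bookkeeping state. */
struct sketch_ucred {
	unsigned int	cr_ref;		/* reference count, not identity */
	unsigned int	cr_uid;
	int		cr_ngroups;
	unsigned int	cr_groups[SKETCH_NGROUPS];
};

/* Compare only the fields that describe identity, ignoring bookkeeping. */
static bool
sketch_samecred(const struct sketch_ucred *a, const struct sketch_ucred *b)
{
	assert(a->cr_ngroups >= 0 && a->cr_ngroups <= SKETCH_NGROUPS);
	if (a->cr_uid != b->cr_uid || a->cr_ngroups != b->cr_ngroups)
		return (false);
	for (int i = 0; i < a->cr_ngroups; i++) {
		if (a->cr_groups[i] != b->cr_groups[i])
			return (false);
	}
	return (true);
}
```

Two credentials that differ only in their reference count compare equal here, whereas memcmp() over the whole structure reports a difference.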


# 140770 24-Jan-2005 phk

Don't try to create vnode_pager objects on other filesystems vnodes,
either they did it themselves or it won't happen.


# 140495 19-Jan-2005 ps

Now that we have a non blocking version of nfsm_dissect(), change all the
nfsm_dissect() calls (done under the NFSD lock) to nfsm_dissect_nonblock().

Submitted by: Mohan Srinivasan


# 140048 11-Jan-2005 phk

Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().

I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:

The credentials for syncing a file (ability to write to the
file) should be checked at the system call level.

Credentials for syncing one or more filesystems ("none")
should be checked at the system call level as well.

If the filesystem implementation needs a particular credential
to carry out the syncing it would logically have to be the
cached mount credential, or a credential cached along with
any delayed write data.

Discussed with: rwatson


# 139823 06-Jan-2005 imp

/* -> /*- for license, minor formatting changes


# 137589 11-Nov-2004 rwatson

Correct a bug in nfsrv_create() where a call to nfsrv_access() might
be made holding the NFS server mutex. To clean this up, introduce a
version of the function, nfsrv_access_withgiant(), that expects the
NFS server mutex to already have been dropped and Giant acquired.
Wrap nfsrv_access() around this. This permits callers to more
efficiently check access if they're in a code block performing VFS
operations, and can be substituted for the nfsrv_access() call that
triggered this bug.

PR: 73807, 73208
MFC after: 1 week


# 136767 22-Oct-2004 phk

Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.

Initialize b_bufobj for all buffers.

Make incore() and gbincore() take a bufobj instead of a vnode.

Make inmem() local to vfs_bio.c

Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj)
also VI_MTX() to BO_MTX(),

Make buf_vlist_add() take a bufobj instead of a vnode.

Eliminate other uses of bp->b_vp where bp->b_bufobj will do.

Various minor polishing: remove "register", turn panic into KASSERT,
use new function declarations, TAILQ_FOREACH_SAFE() etc.


# 136662 18-Oct-2004 rwatson

Correct several instances where calls to vfs_getvfs() resulting in
failure in the NFS server would result in a leaked instance of the NFS
server subsystem lock. Liberally sprinkle assertions in all target
labels for error unwinding to assert the desired locking state.

RELENG_5_3 candidate.

MFC after: 3 days
Reported by: Wilkinson, Alex <alex dot wilkinson at dsto dot defence dot gov dot au>


# 134296 25-Aug-2004 rwatson

Convert a mtx_lock(&Giant) to a mtx_unlock(&Giant) in nfsrv_link() to
prevent leakage of Giant. With INVARIANTS, this results in an
assertion failure following execution of the RPC. Without INVARIANTS,
it could result in problems if the NFS server is killed causing nfsd
to return to user space holding Giant.

Feet provided by: brueffer


# 130640 17-Jun-2004 phk

Second half of the dev_t cleanup.

The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.


# 129899 31-May-2004 rwatson

Release NFS subsystem lock and acquire Giant when calling into
vn_start_write().


# 129886 30-May-2004 rwatson

One more case where we want to drop the NFS server lock and acquire
Giant when entering VFS. Discovered by code inspection; still not
hit without debug.mpsafenet=1.

Reported by: bmilekic


# 129885 30-May-2004 rwatson

Acquire Giant around two more cases when calling into VFS to vput()
a vnode. Not bumped into with asserts in the main tree because we
run the NFS server with Giant by default. Discovered by inspection.

Complete annotations of Giant acquisition/release to note that it's
only because of VFS that we acquire Giant in most places in the NFS
server.


# 129841 29-May-2004 rwatson

Don't release Giant until after the call to vput() in nfsrv_setattr().
Unless running with debug.mpsafenet=1, this was not actually a problem.


# 129785 27-May-2004 rwatson

Call nfsm_clget_nolock() instead of nfsm_clget() when holding the NFS
subsystem lock to avoid tripping over an assertion regarding whether
the lock is held or not. This is likely to be the cause of a panic
tripped over by Andrea Campi.


# 129639 24-May-2004 rwatson

The socket code upcalls into the NFS server using the so_upcall
mechanism so that early processing on mbufs can be performed before
a context switch to the NFS server threads. Because of this, if
the socket code is running without Giant, the NFS server also needs
to be able to run the upcall code without relying on the presence of
Giant. This change modifies the NFS server to run using a "giant
code lock" covering operation of the whole subsystem. Work is in
progress to move to data-based locking as part of the NFSv4 server
changes.

Introduce an NFS server subsystem lock, 'nfsd_mtx', and a set of
macros to operate on the lock:

NFSD_LOCK_ASSERT()      Assert nfsd_mtx owned by current thread
NFSD_UNLOCK_ASSERT()    Assert nfsd_mtx not owned by current thread
NFSD_LOCK_DONTCARE()    Advisory: this function doesn't care
NFSD_LOCK()             Lock nfsd_mtx
NFSD_UNLOCK()           Unlock nfsd_mtx

Constify a number of global variables/structures in the NFS server
code, as they are not modified and contain constants only:

nfsrvv2_procid nfsrv_nfsv3_procid nonidempotent
nfsv2_repstat nfsv2_type nfsrv_nfsv3_procid
nfsrvv2_procid nfsrv_v2errmap nfsv3err_null
nfsv3err_getattr nfsv3err_setattr nfsv3err_lookup
nfsv3err_access nfsv3err_readlink nfsv3err_read
nfsv3err_write nfsv3err_create nfsv3err_mkdir
nfsv3err_symlink nfsv3err_mknod nfsv3err_remove
nfsv3err_rmdir nfsv3err_rename nfsv3err_link
nfsv3err_readdir nfsv3err_readdirplus nfsv3err_fsstat
nfsv3err_fsinfo nfsv3err_pathconf nfsv3err_commit
nfsrv_v3errmap

There are additional structures that should be constified but due
to their being passed into general purpose functions without const
arguments, I have not yet converted.

In general, acquire nfsd_mtx when accessing any of the global NFS
structures, including struct nfssvc_sock, struct nfsd, struct
nfsrv_descript.

Release nfsd_mtx whenever calling into VFS, and acquire Giant for
calls into VFS. Giant is not required for any part of the
operation of the NFS server with the exception of calls into VFS.
Giant will never be acquired in the upcall code path. However, it
may operate entirely covered by Giant, or not. If debug.mpsafenet
is set to 0, the system calls will acquire Giant across all
operations, and the upcall will assert Giant. As such, by default,
this enables locking and allows us to test assertions, but should not
cause any substantial new amount of code to be run without Giant.
Bugs should manifest in the form of lock assertion failures for now.

This approach is similar (but not identical) to modifications to the
BSD/OS NFS server code snapshot provided by BSDi as part of their
SMPng snapshot. The strategy is almost the same (single lock over
the NFS server), but differs in the following ways:

- Our NFS client and server code bases don't overlap, which means
both fewer bugs and easier locking (thanks Peter!). Also means
NFSD_*() as opposed to NFS_*().

- We make broad use of assertions, whereas the BSD/OS code does not.

- Made slightly different choices about how to handle macros building
packets but operating with side effects.

- We acquire Giant only when entering VFS from the NFS server daemon
threads.

- Serious bugs in BSD/OS implementation corrected -- the snapshot we
received was clearly a work in progress.

Based on ideas from: BSDi SMPng Snapshot
Reviewed by: rick@snowhite.cis.uoguelph.ca
Extensive testing by: kris


# 128154 12-Apr-2004 mux

Don't send the available space as is in the FSSTAT call. Under
FreeBSD, we can have a negative available space value, but the
corresponding fields in the NFS protocol are unsigned. So
truncate the value to 0 if it's negative, so that the client
doesn't receive absurdly high values.

Tested by: cognet
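
The clamping described in this entry can be sketched in a few lines. The function name avail_for_wire() and the 64-bit widths are assumptions for illustration; the point is only that a negative signed value must be forced to 0 before it is placed in an unsigned protocol field.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a possibly negative "available space" value into the unsigned
 * form used on the wire, reporting 0 instead of wrapping around.
 */
static uint64_t
avail_for_wire(int64_t avail)
{
	return (avail < 0 ? 0 : (uint64_t)avail);
}
```

Without the clamp, converting -1 to an unsigned 64-bit field yields the largest representable value, which is the "absurdly high" figure a client would otherwise see.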


# 127977 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


# 126853 11-Mar-2004 phk

Properly vector all bwrite() and BUF_WRITE() calls through the same path
and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().


# 121473 24-Oct-2003 phk

When grabbing vnodes to service NFS requests, make sure to call
vn_start_write() early to avoid snapshot deadlocks.

By: mckusick


# 116789 24-Jun-2003 iedowse

Fix a bug in nfsrv_read() that caused the replies to certain NFSv3
short read operations at the end of a file to not have the "eof"
flag set as they should. The problem is that the requested read
count was compared against the rounded-up reply data length instead
of the actual reply data length. This bug appears to have been
introduced in revision 1.78 (June 1999). It causes first-time reads
of certain file sizes (e.g 4094 bytes) to fail with EIO on a RedHat
9.0 NFSv3 client.

MFC after: 1 week
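
The comparison bug described in this entry can be sketched as below. This is a simplified model, not the code of nfsrv_read(): the function names are hypothetical, the 4-byte rounding reflects XDR padding of opaque data, and treating any short read as end-of-file is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XDR pads opaque data to a multiple of 4 bytes. */
static uint32_t
xdr_roundup(uint32_t len)
{
	return ((len + 3) & ~UINT32_C(3));
}

/* Old behaviour: compare the request against the padded reply length. */
static bool
eof_buggy(uint32_t reqcount, uint32_t nread)
{
	return (xdr_roundup(nread) < reqcount);
}

/* Corrected behaviour: compare against the actual number of bytes read. */
static bool
eof_fixed(uint32_t reqcount, uint32_t nread)
{
	return (nread < reqcount);
}
```

For a 4094-byte file and an assumed 4096-byte read request, the padded length equals the request size, so the old comparison leaves "eof" clear, while the corrected one sets it.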


# 116657 21-Jun-2003 mckusick

Increase the size of the NFS server hash table to improve performance
when serving up more than about 32 active files. For details see
section 6.3 (pg 111) of Daniel Ellard and Margo Seltzer, ``NFS
Tricks and Benchmarking Traps'' in the Proceedings of the Usenix
2003 Freenix Track, June 9-14, 2003 pg 101-114.

Obtained from: Daniel Ellard <ellard@eecs.harvard.edu>
Sponsored by: DARPA & NAI Labs.


# 115301 25-May-2003 truckman

Beat vnode locking in the NFS server code into submission. This change
is not pretty, but it fixes the code so that it no longer violates the
vnode locking rules in the VFS API and doesn't trip any of the locking
assertions enabled by the DEBUG_VFS_LOCKS kernel configuration option.
There is one report that this patch fixed a "locking against myself"
panic on an NFS server that was tripped by a diskless client.

Approved by: re (scottl)


# 113955 24-Apr-2003 alc

- Acquire the vm_object's lock when performing vm_object_page_clean().
- Add a parameter to vm_pageout_flush() that tells vm_pageout_flush()
whether its caller has locked the vm_object. (This is a temporary
measure to bootstrap vm_object locking.)


# 112179 13-Mar-2003 jeff

- Lock bufs before inspecting their flags.


# 111463 25-Feb-2003 jeff

- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
these fields instead.
- Hold the vnode interlock while locking bufs on the clean/dirty queues.
This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by: arch, mckusick


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 108533 01-Jan-2003 schweikh

Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.


# 108356 28-Dec-2002 dillon

Abstract-out the constants for the sequential heuristic.

No operational changes.

MFC after: 1 day


# 107636 05-Dec-2002 iedowse

In the NFSv3 `fsinfo' procedure reply, don't claim that we support
32k read and write operations on datagram sockets when in fact we
reject requests larger than 16k. It must be the case that virtually
all clients use data sizes of 16k or less for UDP transport (FreeBSD's
client defaults to 8k and never exceeds 16k), as this bug has been
present ever since NFSv3 support was added.

Reported by: Senthil <lihtnes78@netscape.net>
Reviewed by: dillon
Approved by: re
MFC-after: 1 week
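
A sketch of the idea, assuming the advertised limit is chosen by socket
type so that it matches what the server actually accepts. The constants and
function names below are hypothetical illustrations, not the identifiers
used in the server.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_MAXDATA_STREAM	(32 * 1024)	/* assumed stream limit */
#define EX_MAXDATA_DGRAM	(16 * 1024)	/* limit for datagram sockets */

/* Largest transfer size to advertise in the fsinfo reply. */
static uint32_t
ex_fsinfo_max_xfer(bool datagram)
{
	return (datagram ? EX_MAXDATA_DGRAM : EX_MAXDATA_STREAM);
}

/* A request is accepted only if it fits the advertised limit. */
static bool
ex_request_accepted(bool datagram, uint32_t count)
{
	return (count <= ex_fsinfo_max_xfer(datagram));
}
```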


# 106264 31-Oct-2002 jeff

- Introduce a new macro, since that's what nfs loves, called
nfsm_srvpathsiz. This macro plucks a length out of an rpc request and
verifies that its size does not exceed NFS_MAXPATHLEN. If it does
it generates an ENAMETOOLONG response.
- Use this macro, and the existing nfsm_srvnamsiz macro in two places
where we deal with paths passed in by the client.

This fixes a Linux interoperability bug. Linux was sending oversized path
components which would cause us to ignore the request altogether. This
causes Linux to hang indefinitely while it waits for a response. This
could still happen in other cases where we error out with EBADRPC.

Sponsored by: Isilon Systems, Inc.
Reviewed by: alfred, fabbri@isilon.com, neal@isilon.com
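
The length check can be sketched as a plain function rather than a macro.
This is an illustration only: EX_MAXPATHLEN and ex_check_path_len() are
hypothetical names, and the limit of 1024 is an assumption made for the
example.

```c
#include <errno.h>
#include <stdint.h>

#define EX_MAXPATHLEN	1024	/* assumed limit, for illustration only */

/*
 * Validate a path length taken from a request. Returning an error code
 * lets the caller build an error reply instead of dropping the request,
 * which is what left the client waiting.
 */
static int
ex_check_path_len(uint32_t len)
{
	if (len > EX_MAXPATHLEN)
		return (ENAMETOOLONG);
	return (0);
}
```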


# 104424 03-Oct-2002 rwatson

Correct a problem wherein NFS servers running NFSv2 would not return
certain classes of failure responses to the client during a failed
remove operation.

Submitted by: Ian Dowse <iedowse@maths.tcd.ie>


# 103940 25-Sep-2002 jeff

- Use incore() instead of gbincore() so we don't have to acquire the
vnode interlock.


# 101308 04-Aug-2002 jeff

- Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
with VOP calls is needed.
- v_iflag is protected by interlock and is used for dealing with vnode
management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
- All accesses to v_iflag and v_vflag have either been locked or marked with
mp_fixme's.
- Many ASSERT_VOP_LOCKED calls have been added where the locking was not
clear.
- Many functions in vfs_subr.c were restructured to provide for stronger
locking.

Idea stolen from: BSD/OS


# 99797 11-Jul-2002 dillon

Convert old style (type foo *)0 casts to NULLs

PR: kern/40360
Requested by: Hiten Pandya via direct email


# 99737 10-Jul-2002 dillon

Replace the global buffer hash table with per-vnode splay trees using a
methodology similar to the vm_map_entry splay and the VM splay that Alan
Cox is working on. Extensive testing has appeared to have shown no
increase in overhead.

Disadvantages
Dirties more cache lines during lookups.

Not as fast as a hash table lookup (but still N log N and optimal
when there is locality of reference).

Advantages
vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/filesystem
syncer operate more efficiently.

I get to rip out all the old hacks (some of which were mine) that tried
to keep the v_dirtyblkhd tailq sorted.

The per-vnode splay tree should be easier to lock / SMPng pushdown on
vnodes will be easier.

This commit along with another that Alan is working on for the VM page
global hash table will allow me to implement ranged fsync(), optimize
server-side nfs commit rpcs, and implement partial syncs by the
filesystem syncer (aka filesystem syncer would detect that someone is
trying to get the vnode lock, remembers its place, and skip to the
next vnode).

Note that the buffer cache splay is somewhat more complex than other splays
due to special handling of background bitmap writes (multiple buffers with
the same lblkno in the same vnode), and B_INVAL discontinuities between the
old hash table and the existence of the buffer on the v_cleanblkhd list.

Suggested by: alc


# 96755 16-May-2002 trhodes

More s/file system/filesystem/g


# 95215 21-Apr-2002 iedowse

Limit to the maximum allowed reply size the amount of data that
nfsrv_readdir and nfsrv_readdirplus can return. A client request
containing an over-large `count' field could trigger the "Bad nfs
svc reply" panic in nfs_syscalls.c.

Spotted while trying to reproduce kern/37304, which turned out to
be fixed in FreeBSD a long time ago.

MFC after: 1 week


# 93593 01-Apr-2002 jhb

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 92462 16-Mar-2002 mckusick

Add a flags parameter to VFS_VGET to pass through the desired
locking flags when acquiring a vnode. The immediate purpose is
to allow polling lock requests (LK_NOWAIT) needed by soft updates
to avoid deadlock when enlisting other processes to help with
the background cleanup. For the future it will allow the use of
shared locks for read access to vnodes. This change touches a
lot of files as it affects most filesystems within the system.
It has been well tested on FFS, loopback, and CD-ROM filesystems, but
only lightly on the others, so if you find a problem there, please
let me (mckusick@mckusick.com) know.


# 91406 27-Feb-2002 jhb

Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.


# 89367 14-Jan-2002 dillon

The vnode was not being vput()'d in the EEXIST mknod case on the nfs
server side. This can lead to a system deadlock.

Reviewed by: iedowse
Tested by: Alexey G Misurenko <mag@caravan.ru>, iedowse
Bug found with help by: Alexey G Misurenko <mag@caravan.ru>
MFC at: earliest convenience


# 89302 13-Jan-2002 iedowse

It is required by VOP_CREATE, VOP_MKNOD, VOP_SYMLINK and VOP_MKDIR
that va_mode of the supplied attributes is filled in with a valid
file mode (i.e. not VNOVAL, and only ALLPERM bits set). However,
some NFS server op functions didn't guarantee this for all possible
request messages:

If a V3 client chose not to include a mode specification, we could
end up creating an ffs inode with mode 0177777, requiring a manual
fsck on the next reboot. Fix this by setting va_mode to 0 before
calling the VOP if a mode hasn't been supplied by the client.

In nfsrv_symlink(), S_IFMT bits supplied by a V2 client could end
up in the va_mode passed to VOP_SYMLINK with similar effects. We
now use the macro nfstov_mode() to correctly mask the bits.
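
A sketch of the two fixes, assuming a 16-bit mode in which the "no value"
marker appears as 0177777 and the permission mask is 07777. The names below
are hypothetical; the real code uses nfstov_mode() as stated above.

```c
#include <stdint.h>

#define EX_ALLPERMS	07777u	/* permission bits, assumed mask */

/*
 * Produce a mode that is safe to pass to a create-style operation:
 * 0 when the client supplied no mode, otherwise only the permission
 * bits of what the client sent (file type bits are dropped).
 */
static uint16_t
ex_sanitize_mode(uint16_t mode, int mode_supplied)
{
	if (!mode_supplied)
		return (0);
	return ((uint16_t)(mode & EX_ALLPERMS));
}
```

With this shape, an unset mode of 0177777 never reaches the filesystem, and
a client-supplied 0120777 is reduced to 0777.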


# 89278 12-Jan-2002 iedowse

Fix a few NFSv2 issues that slipped in during the big cleanup. The
semantics of the nfsm_reply() macro were changed so that the caller
has to explicitly handle the V2 error case, whereas before,
nfsm_reply() did a `goto nfsmout' then. A few server ops (setattr,
readlink, create, mkdir) weren't updated to match, so errors in the
V2 case could cause protocol hangs and leaked mbufs.

Correct some comments that describe the old nfsm_reply behaviour.

[older, harmless nit] Remove the unnecessary `nfsmreply0' label in
nfsrv_create(), since for its users, the main `ereply' label does
the same thing.


# 89094 08-Jan-2002 msmith

Rename some variables that end up shadowing their namesakes in the NFS client
code.

Reviewed by: peter


# 88091 17-Dec-2001 iedowse

Avoid passing the variable `tl' to functions that just use it for
temporary storage. In the old NFS code it wasn't at all clear if
the value of `tl' was used across or after macro calls, but I'm
fairly confident that the convention was to keep its use local.
Each ex-macro function now uses a local version of this variable,
so all of the double-indirection goes away.

The only exception to the `local use' rule for `tl' is nfsm_clget(),
which is left unchanged by this commit.

Reviewed by: peter


# 87361 04-Dec-2001 iedowse

When VOP_SYMLINK fails, the value of *vpp is junk, so we must NULL
out nd.ni_vp to prevent the resource cleanup code at the end of
nfsrv_symlink from trying to vrele it. This fixes a "vrele: negative
ref cnt" panic that can occur when a symlink is attempted on an NFS
filesystem with no free space. Found locally, but the symptoms
correspond to those in the PR referenced below.

PR: kern/26878
MFC after: 3 days


# 85498 25-Oct-2001 iedowse

Now that nfsm_reply() does not usually set 'error' to 0, we need
to do it explicitly in nfsrv_noop so that the reply gets sent back
to the client. This fixes the generation of a selection of RPC
error replies (RPC_PROGMISMATCH, RPC_PROGUNAVAIL, RPC_PROCUNAVAIL
etc.) that are used by some clients to detect support for optional
protocols and features.

Reviewed by: peter
Reported by: Thomas Quinot <quinot@inf.enst.fr>
PR: kern/31479


# 84079 28-Sep-2001 peter

Unwind some more macros. NFSMADV() was kinda silly since it was right
next to equivalent m_len adjustments. Move the nfsm_subs.h macros
into groups depending on which phase they are used in, since that
affects the error recovery requirements. Collect some of the common error
checking into a single macro as preparation for unwinding some more.
Have nfs_rephead return a value instead of secretly modifying args.
Remove some unused function arguments that were being passed around.
Clarify nfsm_reply()'s error handling (I hope).


# 84057 27-Sep-2001 peter

Make nfsm_dissect() have an obvious return value.


# 84002 27-Sep-2001 peter

Tidy up nfsm_build usage. This is only partially finished.


# 83651 18-Sep-2001 peter

Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 76117 29-Apr-2001 grog

Revert consequences of changes to mount.h, part 2.

Requested by: bde


# 75858 23-Apr-2001 grog

Correct #includes to work with fixed sys/mount.h.


# 72645 18-Feb-2001 asmodai

Preceed/preceeding are not English words. Use precede and preceding.


# 72216 09-Feb-2001 iedowse

Fix some problems that were introduced in revision 1.97. Instead
of returning an error code to the caller, NFS server op routines
must themselves build an error reply and return 0 to the caller.

This is achieved by replacing the erroneous return statements with
code that jumps forward to the op function's reply code. We need
to be careful to ensure that the 'struct mount' pointer is NULL
though, so that the final vn_finished_write() call becomes a no-op.

Reviewed by: mckusick, dillon


# 70254 21-Dec-2000 bmilekic

* Rename M_WAIT mbuf subsystem flag to M_TRYWAIT.
This is because calls with M_WAIT (now M_TRYWAIT) may not wait
forever when nothing is available for allocation, and may end up
returning NULL. Hopefully we now communicate more of the right thing
to developers and make it very clear that it's necessary to check whether
calls with M_(TRY)WAIT also resulted in a failed allocation.
M_TRYWAIT basically means "try harder, block if necessary, but don't
necessarily wait forever." The time spent blocking is tunable with
the kern.ipc.mbuf_wait sysctl.
M_WAIT is now deprecated but still defined for the next little while.

* Fix a typo in a comment in mbuf.h

* Fix some code that was actually passing the mbuf subsystem's M_WAIT to
malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the
value of the M_WAIT flag, this could have become a big problem.


# 62976 11-Jul-2000 mckusick

Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness
are not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).


# 60041 05-May-2000 phk

Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bde's teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.h> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by: peter


# 59794 30-Apr-2000 phk

Remove unneeded #include <vm/vm_zone.h>

Generated by: src/tools/tools/kerninclude


# 58349 20-Mar-2000 phk

Rename the existing BUF_STRATEGY() to DEV_STRATEGY()

substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)

substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)

This patch is machine generated except for the ccd.c and buf.h parts.


# 54785 18-Dec-1999 dillon

Fix compilation warning on alpha when converting pointer to integer
to generate hash index.

Reviewed by: Andrew Gallatin <gallatin@cs.duke.edu>


# 54693 16-Dec-1999 dillon

Have NFS use a snapshot of boottime instead of boottime itself to
generate the NFSv3 Version id. boottime itself may change, sometimes
once every tick if you are running xntpd, which really throws off
clients. Clients will tend to throw away what they believe to be
stale data too often, and can get into long loops rewriting the same
data over and over again because they believe the server has rebooted
over and over again due to the changing version id.

Approved by: jkh


# 54655 15-Dec-1999 eivind

Introduce NDFREE (and remove VOP_ABORTOP)


# 54567 13-Dec-1999 dillon

Add a readahead heuristic to the NFS server side code. While the server
cannot unilaterally pass data to a client it can reduce the physical
disk transaction overhead by reading larger blocks. This results in
better pipelining of requests/responses over the network and an almost
100% increase in cpu efficiency on the server. On a 100BaseTX network
NFS read performance increases from 8.5 MBytes/sec to 10 MB/sec (maxed
out), and cpu efficiency increases from 72% idle to 80% idle on the server.

Reviewed by: Alfred Perlstein <bright@wintelcom.net>


# 54485 12-Dec-1999 dillon

Fix a number of server-side issues related to aborting badly formed
NFS packets, mainly initializing structure pointers to NULL which
are conditionally freed prior to return.

PR: kern/15249
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>


# 53131 13-Nov-1999 eivind

Remove WILLRELE from VOP_SYMLINK

Note: Previous commit to these files (except coda_vnops and devfs_vnops)
that claimed to remove WILLRELE from VOP_RENAME actually removed it from
VOP_MKNOD.


# 53101 12-Nov-1999 eivind

Remove WILLRELE from VOP_RENAME


# 51796 29-Sep-1999 dillon

Make FreeBSD less conservative in determining when to return a cookie
error for a directory. I have made this change after a great deal of
review although I cannot be absolutely sure that this meets the spec.

The issue devolves into whether changes in an underlying (UFS) directory
can cause NFS directory blocks to be renumbered. My read of the code
indicates that NFS directory blocks will not be renumbered, which means
that the cookies should still remain valid after a change is made to
the underlying directory. This being the case, a cookie error should
not be returned when a change is made to the underlying directory and,
instead, the NFS client should rely on mtime detection to invalidate and
reload the directory.

The use of mtime is problematic in and of itself, due to insufficient
resolution, which is why I believe the original conservative error
handling was done. Still, there have been dozens of bug reports by
people needing solaris<->FreeBSD interoperability and these have to
be accommodated.


# 51344 17-Sep-1999 dillon

Asynchronized client-side nfs_commit. NFS commit operations were
previously issued synchronously even if async daemons (nfsiod's) were
available. The commit has been moved from the strategy code to the doio
code in order to asynchronize it.

Removed use of lastr in preparation for removal of vnode->v_lastr. It
has been replaced with seqcount, which is already supported by the system
and, in fact, gives us a better heuristic for sequential detection than
lastr ever did.

Made major performance improvements to the server side commit. The
server previously fsync'd the entire file for each commit rpc. The
server now bawrite()s only those buffers related to the offset/size
specified in the commit rpc.

Note that we do not commit the meta-data yet. This work still needs
to be done.

Note that a further optimization can be done (and has not yet been done)
on the client: we can merge multiple potential commit rpc's into a
single rpc with a greater file offset/size range and greatly reduce
rpc traffic.

Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 49232 29-Jul-1999 wpaul

Correct the sanity test length calculation in nfsrv_readdirplus(): len is
being incremented by 4 bytes too few each time through the loop, which
allows more data into the mbuf chain than we really want. In the worst
case, when we're using 32K read/write sizes with a TCP client, this causes
readdirplus replies to sometimes exceed NFS_MAXPACKET which leads to a
panic. This problem cropped up for me using an IRIX 6.5.4 NFSv3 TCP client
with 32K read/write sizes, however supposedly it can be triggered by
WinNT NFS servers too. In theory, it can probably be triggered by any
NFS v3 implementation using TCP as long as it's using the maximum block
size.

Reviewed by: Matthew Dillon <dillon@backplane.com>


# 49158 28-Jul-1999 alc

Clear error in nfsrv_create when we have a valid reply so that
that reply is actually transmitted.
Submitted by: dillon


# 48859 17-Jul-1999 phk

I have not one single time remembered the name of this function correctly
so obviously I gave it the wrong name. s/umakedev/makeudev/g


# 48362 30-Jun-1999 julian

Submitted by: "David E. Cross" <crossd@cs.rpi.edu>
Matt missed a line..


# 48125 23-Jun-1999 julian

Matt's NFS fixes.
Submitted by: Matt Dillon
Reviewed by: David Cross, Julian Elischer, Mike Smith, Drew Gallatin
3.2 version to follow when tested


# 47751 05-Jun-1999 peter

Various changes lifted from the OpenBSD cvs tree:

txdr_hyper and fxdr_hyper tweaks to avoid excessive CPU order knowledge.

nfs_serv.c: don't call nfsm_adj() with negative values, windows clients
could crash servers when doing a readdir of a large directory.

nfs_socket.c: Use IP_PORTRANGE to get a privileged port without a spin
loop trying to bind(). Don't clobber an mbuf pointer or we get panics
on a NFS3ERR_JUKEBOX error from a server when reusing a freed mbuf.

nfs_subs.c: Don't lose st_blocks on NFSv2 mounts when > 2GB.

Obtained from: OpenBSD


# 47028 11-May-1999 phk

Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
major() umajor()
minor() uminor()
makedev() umakedev()
dev2udev() udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to an actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few housekeeping bits. This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference. If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I believe this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).


# 46568 06-May-1999 peter

Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.


# 46155 28-Apr-1999 phk

This Implements the mumbled about "Jail" feature.

This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

I have no scripts for setting up a jail, don't ask me for them.

The IP number should be an alias on one of the interfaces.

mount a /proc in each jail, it will make ps more useable.

/proc/<pid>/status tells the hostname of the prison for
jailed processes.

Quotas are only sensible if you have a mountpoint per prison.

There are no provisions for stopping resource-hogging.

Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/


# 46112 27-Apr-1999 phk

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


# 44078 16-Feb-1999 dfr

* Change sysctl from using linker_set to construct its tree using SLISTs.
This makes it possible to change the sysctl tree at runtime.

* Change KLD to find and register any sysctl nodes contained in the loaded
file and to unregister them when the file is unloaded.

Reviewed by: Archie Cobbs <archie@whistle.com>,
Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)


# 41619 09-Dec-1998 eivind

Remove the if fixed in the last commit; bde quite correctly points out
that it can never fail.


# 41606 08-Dec-1998 eivind

Fix typo (; in "if (vp == NULL);").


# 40792 31-Oct-1998 peter

vm_object_page_clean() last arg changed from TRUE to OBJPC_SYNC. I'm not
sure that this is necessary to be a sync write here since a VOP_FSYNC()
follows and it will schedule, sort and complete the writes that the
vm_object_page_clean() started (as I think I understand things).


# 36735 07-Jun-1998 dfr

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
inline with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# 36541 31-May-1998 peter

For the on-the-wire protocol, u_long -> u_int32_t; long -> int32_t;
int -> int32_t; u_short -> u_int16_t. Also, use mode_t instead of u_short
for storing modes (mode_t is a u_int16_t).

Obtained from: NetBSD


# 36539 31-May-1998 peter

Cut-n-paste glitch


# 36533 31-May-1998 peter

Hide whiteouts from NFS, since the protocol doesn't support them.

Obtained from: NetBSD


# 36532 31-May-1998 peter

NetBSD has a comment that Solaris 2.5 doesn't do verifiers correctly,
we have weakened this test already for Digital Unix, so it may be enough
for Solaris. It needs to be checked again.

Obtained from: NetBSD


# 36512 31-May-1998 peter

Refuse READDIR / READDIRPLUS rpc's for non-directories

Obtained from: NetBSD


# 36503 31-May-1998 peter

NFS Jumbo commit part 1. Cosmetic and structural changes only. The aim
of this part of commits is to minimize unnecessary differences between
the other NFS's of similar origin. Yes, there are gratuitous changes here
that the style folks won't like, but it makes the catch-up less difficult.


# 36473 30-May-1998 peter

When using NFSv3, use the remote server's idea of the maximum file size
rather than assuming 2^64. It may not like files that big. :-)
On the nfs server, calculate and report the max file size as the point
that the block numbers in the cache would turn negative.
(ie: 1099511627775 bytes (1TB)).

One of the things I'm worried about however, is that directory offsets
are really cookies on a NFSv3 server and can be rather large, especially
when/if the server generates the opaque directory cookies by using a local
filesystem offset in what comes out as the upper 32 bits of the 64 bit
cookie. (a server is free to do this, it could save byte swapping
depending on the native 64 bit byte order)

Obtained from: NetBSD
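
The quoted figure can be reproduced with a short calculation. The sketch
assumes that block numbers are signed 32-bit integers counting 512-byte
blocks; neither assumption is stated in the message above, and the names
are hypothetical.

```c
#include <stdint.h>

#define EX_BLOCK_SIZE	512	/* assumed block size in bytes */

/*
 * Largest file size, in bytes, whose last byte still lies in a block
 * whose number fits in a signed 32-bit integer.
 */
static uint64_t
ex_max_file_size(void)
{
	/* First block number that would overflow into the sign bit. */
	uint64_t first_bad_block = (uint64_t)INT32_MAX + 1;

	return (first_bad_block * EX_BLOCK_SIZE - 1);
}
```

Under these assumptions the result is 2^31 * 512 - 1 = 2^40 - 1, which
matches the 1099511627775 bytes given above.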


# 36251 20-May-1998 peter

Only ignore "owner" permissions selectively rather than always. In some
cases we ignore it (eg: read/write) to maintain chmod-after-open semantics
but in other cases we do care, eg: creating files, access() etc. Never
ignore errors from VOP_ACCESS() on immutable files.

This apparently comes from BSDI (from Keith Bostic) via NetBSD.

PR: 5148
Submitted by: Yoshiro MIHIRA <sanpei@yy.cs.keio.ac.jp>


# 35823 07-May-1998 msmith

In the words of the submitter:

---------
Make callers of namei() responsible for releasing references or locks
instead of having the underlying filesystems do it. This eliminates
redundancy in all terminal filesystems and makes it possible for stacked
transport layers such as umapfs or nullfs to operate correctly.

Quality testing was done with testvn, and lat_fs from the lmbench suite.

Some NFS client testing courtesy of Patrik Kudo.

vop_mknod and vop_symlink still release the returned vpp. vop_rename
still releases 4 vnode arguments before it returns. These remaining cases
will be corrected in the next set of patches.
---------

Submitted by: Michael Hancock <michaelh@cet.co.jp>


# 34961 30-Mar-1998 phk

Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't an atomic variable, so splfoo() protection was needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now use the new variable time_second instead.

gettime() changed to getmicrotime().

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeared time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passed &time as args.

Change profiling in ncr.c to use ticks instead of time. Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by: bde


# 33181 09-Feb-1998 eivind

Staticize.


# 33134 06-Feb-1998 eivind

Back out DIAGNOSTIC changes.


# 33108 04-Feb-1998 eivind

Turn DIAGNOSTIC into a new-style option.


# 32937 31-Jan-1998 dyson

Change the busy page mgmt, so that when pages are freed, they
MUST be PG_BUSY. It is bogus to free a page that isn't busy,
because it is in a state of being "unavailable" when being
freed. The additional advantage is that the page_remove code
has a better cross-check that the page should be busy and
unavailable for other use. There were some minor problems
with the collapse code, and this plugs those subtle "holes."

Also, the vfs_bio code wasn't checking correctly for PG_BUSY
pages. I am going to develop a more consistent scheme for
grabbing pages, busy or otherwise. For now, we are stuck
with the current morass.


# 32071 28-Dec-1997 dyson

Lots of improvements, including restructuring the caching and management
of vnodes and objects. There are some metadata performance improvements
that come along with this. There are also a few prototypes added when
the need is noticed. Changes include:

1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.

Be gentle, and please give me feedback asap.


# 32011 27-Dec-1997 bde

Unspammed nested include of <vm/vm_zone.h>.


# 30813 28-Oct-1997 bde

Removed unused #includes.


# 29653 21-Sep-1997 dyson

Change the M_NAMEI allocations to use the zone allocator. This change
plus the previous changes to use the zone allocator decrease the usage
of malloc by half. The Zone allocator will be upgradeable to be able
to use per CPU-pools, and has more intelligent usage of SPLs. Additionally,
it has reasonable stats gathering capabilities, while making most calls
inline.


# 29291 10-Sep-1997 phk

Remove a couple of stubborn NetBSD #if's.


# 29288 10-Sep-1997 phk

unifdef -U__NetBSD__ -D__FreeBSD__


# 29024 01-Sep-1997 bde

Added used #include - don't depend on <sys/mbuf.h> including
<sys/malloc.h> (unless we only use the bogusly shared M*WAIT flags).


# 28270 16-Aug-1997 wollman

Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs. (Socket buffers are the one exception.) A number
of kernel APIs needed to get fixed in order to make this happen. Also,
fix three protocol families which kept PCBs in mbufs to now malloc them
instead. Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.


# 27608 22-Jul-1997 dfr

Allow NULL cookie verifiers for non-NULL offsets. This is needed for
Digital Unix boxes since they appear to always send null verifiers.


# 27446 16-Jul-1997 dfr

Merge WebNFS changes from NetBSD.

Obtained from: NetBSD


# 26637 14-Jun-1997 bde

Don't require superuser privileges for creating fifos. The v2 case was
broken when support for v3 was introduced in rev.1.16. The v3 case has
always been broken in FreeBSD.

Should be in 2.2.

PR: 3838


# 26418 03-Jun-1997 dfr

Implement the async mount option for NFSv3. This makes NFS pretend that all
writes sent to the server were synchronous and therefore no commits are
needed. This is the same as the vfs.nfs.async variable on the server but
allows each client to choose whether to work this way.

Also make the vfs.nfs.async variable do the 'right' thing for NFSv3, i.e.
pretend that the write was synchronous.


# 25664 10-May-1997 dfr

Implement a separate control for write gathering on NFSv3. This is turned
off for NFSv3 by default since write gathering seems to reduce performance
for NFSv3 by up to 60%.

Add sysctl knobs to control both variables.


# 25663 10-May-1997 dfr

Fix a nasty hang connected with write gathering. Also add debug print
statements to bits of the server which helped me find the hang.


# 24381 29-Mar-1997 bde

Removed #include of <ufs/ufs/dir.h>. Nfs no longer depends on any ufs
features, and the one thing that it depended on (DIRBLKSIZ) now has
conflicting spelling.


# 24250 25-Mar-1997 peter

Use the correct (relative to the implementation) ordering of args in
the VOP_LINK() calls, Closes PR#3064

Submitted by: bde


# 24249 25-Mar-1997 peter

The local fs interface does not allow link()/unlink() of directories,
do not allow a remote nfs client to cause local fs corruption either.


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 18397 19-Sep-1996 nate

In sys/time.h, struct timespec is defined as:

    /*
     * Structure defined by POSIX.4 to be like a timeval.
     */
    struct timespec {
            time_t  ts_sec;         /* seconds */
            long    ts_nsec;        /* and nanoseconds */
    };

The correct names of the fields are tv_sec and tv_nsec.

Reminded by: James Drobina <jdrobina@infinet.com>
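
As a small illustration of the corrected field names from r18397 above, the
following sketch reads a struct timespec through tv_sec and tv_nsec. The
helper name timespec_to_ns() is hypothetical and exists only for this
example; it is not part of sys/time.h.

```c
#include <time.h>

/*
 * Hypothetical helper: flatten a timespec into nanoseconds using the
 * POSIX field names tv_sec and tv_nsec.
 */
static long long
timespec_to_ns(const struct timespec *ts)
{
	return ((long long)ts->tv_sec * 1000000000LL + ts->tv_nsec);
}
```

Code written against the old ts_sec/ts_nsec spelling would fail to compile
against a POSIX-conforming header, which is why the rename matters.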


# 18041 05-Sep-1996 dg

Release an unneeded reference to a vnode that was gained in a VFS_VGET().
Fixes a readdirplus panic.

Submitted by: Doug Rabson <dfr@render.com>


# 18020 03-Sep-1996 bde

Eliminated nested include of <sys/unistd.h> in <sys/file.h> in the kernel.
Include it directly in the few places where it is used.

Reduced some #includes of <sys/file.h> to #includes of <sys/fcntl.h> or
nothing.


# 17761 21-Aug-1996 dyson

Even though this looks like it, this is not a complex code change.
The interface into the "VMIO" system has changed to be more consistent
and robust. Essentially, it is now no longer necessary to call vn_open
to get merged VM/Buffer cache operation, and exceptional conditions
such as merged operation of VBLK devices is simpler and more correct.

This code corrects a potentially large set of problems including the
problems with ktrace output and loaded systems, file create/deletes,
etc.

Most of the changes to NFS are cosmetic and name changes, eliminating
a layer of subroutine calls. The direct calls to vput/vrele have
been re-instituted for better cross platform compatibility.

Reviewed by: davidg


# 16226 08-Jun-1996 bde

Fixed a vnode reference leak in nfsrv_rename(). The target inode wasn't
released until the file system was unmounted. This bug also affected
kern/vfs_syscalls.c but was fixed in rev.1.18 and rev.1.20 there.

Reviewed by: davidg


# 15479 30-Apr-1996 bde

Fixed nfs sysctls. They missed out on the fs -> vfs name changes from
Lite2. This broke nfsstat.


# 13416 13-Jan-1996 phk

Add an option NFS_NOSERVER which saves 100K in the install kernel (or
any other kernel that uses it). Use with option NFS.


# 12911 17-Dec-1995 phk

Staticize.


# 12662 07-Dec-1995 dg

Untangled the vm.h include file spaghetti.


# 11921 29-Oct-1995 phk

Second batch of cleanup changes.
This time mostly making a lot of things static and some unused
variables here and there.


# 10224 24-Aug-1995 dg

Added NFS_ASYNC kernel option. It only has an effect for NFSv2.


# 10222 24-Aug-1995 dfr

Some fixes found using gcc -Wall:

nfsm_rpchead() has been called with the wrong number of args and misplaced
args since someone added new args in the middle for nfsv3.

Here's another one that would be important on 64-bit systems. VOP_READDIR
takes a `u_int **cookies' arg.

Submitted by: Bruce Evans <bde@zeta.org.au>


# 9966 06-Aug-1995 dg

Fixed bug where vnode_pager_uncache() wasn't always called when it should
be. The result was that the file's space wouldn't be properly freed when
it was deleted.

Submitted by: John Dyson


# 9877 03-Aug-1995 dfr

Slight changes to locking around VOP_READDIR.
Detect in nfsrv_readdirplus when a filesystem does not support VFS_VGET and
return NFSERR_NOTSUPP so that the client will use ordinary readdir instead.


# 9854 02-Aug-1995 dfr

Lock the directory vnode before VOP_READDIR in nfsrv_readdirplus


# 9842 01-Aug-1995 dg

Removed my special-case hack for VOP_LINK and fixed the problem with the
wrong vp's ops vector being used by changing the VOP_LINK's argument order.
The special-case hack doesn't go far enough and breaks the generic
bypass routine used in some non-leaf filesystems. Pointed out by Kirk
McKusick.


# 9356 28-Jun-1995 dg

1) Converted v_vmdata to v_object.
2) Removed unnecessary vm_object_lookup()/pager_cache(object, TRUE) pairs
after vnode_pager_alloc() calls - the object is already guaranteed to be
persistent.
3) Removed some gratuitous casts.


# 9354 28-Jun-1995 dg

Fixed VOP_LINK argument order botch.


# 9336 27-Jun-1995 dfr

Changes to support version 3 of the NFS protocol.
The version 2 support has been tested (client+server) against FreeBSD-2.0,
IRIX 5.3 and FreeBSD-current (using a loopback mount). The version 2 support
is stable AFAIK.
The version 3 support has been tested with a loopback mount and minimally
against an IRIX 5.3 server. It needs more testing and may have problems.
I have patched amd to support the new variable length filehandles although
it will still only use version 2 of the protocol.

Before booting a kernel with these changes, nfs clients will need to at least
build and install /usr/sbin/mount_nfs. Servers will need to build and
install /usr/sbin/mountd.

NFS diskless support is untested.

Obtained from: Rick Macklem <rick@snowhite.cis.uoguelph.ca>


# 9202 11-Jun-1995 rgrimes

Merge RELENG_2_0_5 into HEAD


# 8876 30-May-1995 rgrimes

Remove trailing whitespace.


# 8832 29-May-1995 dg

Fixed some serious bugs that resulted in object reference counts not being
handled correctly. This would manifest itself as "object deallocated too
many times" panics and perhaps other strange inconsistencies on NFS servers.

Reviewed by: me, of course
Submitted by: John Dyson


# 7160 19-Mar-1995 dg

Removed unnecessary call to vnode_pager_uncache(). We automatically clear
the VTEXT flag after all mappers have finished with the object.


# 7110 17-Mar-1995 dg

Changed some (incorrect) nfsrv_vput()'s back into regular vput()'s. This
fixes the last of the known NQNFS problems (until I find more, that is :-)).


# 6416 15-Feb-1995 dg

Woops, change a nfsrv_vput back into a nfsrv_vrele.

Submitted by: John Dyson


# 6414 15-Feb-1995 dg

Fixed three bugs related to the merged cache changes. The bugs likely would
make NFS servers flakey - probably the cause of freefall's recent hangs.

Submitted by: John Dyson


# 5455 09-Jan-1995 dg

These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code. No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a separate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.

Submitted by: John Dyson and David Greenman


# 3305 02-Oct-1994 phk

Prototyping and general gcc-shutting up. Gcc has one warning now which looks
bad, I will get to it eventually, unless somebody beats me to it.


# 3167 28-Sep-1994 dfr

Make NFS ask the filesystems for directory cookies instead of making them
itself.


# 2979 22-Sep-1994 wollman

More loadable VFS changes:

- Make a number of filesystems work again when they are statically compiled
(blush)

- FIFOs are no longer optional; ``options FIFO'' removed from distributed
config files.


# 2384 29-Aug-1994 dg

"bogus" fixes from 1.1.5 to work around some cache coherency problems.


# 1817 02-Aug-1994 dg

Added $Id$


# 1549 25-May-1994 rgrimes

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources