History log of /freebsd-10.0-release/sys/fs/nullfs/
Revision	Date	Author	Comments
259065	07-Dec-2013	gjb	- Copy stable/10 (r259064) to releng/10.0 as part of the 10.0-RELEASE cycle. - Update __FreeBSD_version [1] - Set branch name to -RC1 [1] 10.0-CURRENT __FreeBSD_version value ended at '55', so start releng/10.0 at '100' so the branch is started with a value ending in zero. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation /freebsd-10.0-release/sys/conf/newvers.sh /freebsd-10.0-release/sys/sys/param.h
256281	10-Oct-2013	gjb	Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
255442	10-Sep-2013	des	Fix the length calculation for the final block of a sendfile(2) transmission which could be tricked into rounding up to the nearest page size, leaking up to a page of kernel memory. [13:11] In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK at the socket layer rather than pass them on to the link layer without validation or credential checks. [SA-13:12] Prevent cross-mount hardlinks between different nullfs mounts of the same underlying filesystem. [SA-13:13] Security: CVE-2013-5666 Security: FreeBSD-SA-13:11.sendfile Security: CVE-2013-5691 Security: FreeBSD-SA-13:12.ifioctl Security: CVE-2013-5710 Security: FreeBSD-SA-13:13.nullfs Approved by: re
252714	04-Jul-2013	kib	The tvp vnode on rename is usually unlinked. Drop the cached null vnode for tvp to allow the free of the lower vnode, if needed. PR: kern/180236 Tested by: smh Sponsored by: The FreeBSD Foundation MFC after: 1 week
250852	21-May-2013	kib	Do not leak the NULLV_NOUNLOCK flag from the nullfs_unlink_lowervp(), for the case when the nullfs vnode is not reclaimed. Otherwise, later reclamation would not unlock the lower vnode. Reported by: antoine Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
250505	11-May-2013	kib	- Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The null_hashget() obtains the reference on the nullfs vnode, which must be dropped. - Fix a wart which existed from the introduction of the nullfs caching, do not unlock lower vnode in the nullfs_reclaim_lowervp(). It should be innocent, but now it is also formally safe. Inform the nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on nullfs inode. - Add a callback to the upper filesystems for the lower vnode unlinking. When inactivating a nullfs vnode, check if the lower vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC on the lower vnode, and reclaim upper vnode if so. This allows nullfs to purge cached vnodes for the unlinked lower vnode, avoiding excessive caching. Reported by: Göran Löwkrantz <goran.lowkrantz@ismobile.com> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
247619	02-Mar-2013	jilles	nullfs: Improve f_flags in statfs(). Include some flags of the nullfs mount itself: MNT_RDONLY, MNT_NOEXEC, MNT_NOSUID, MNT_UNION, MNT_NOSYMFOLLOW. This allows userland code calling statfs() or fstatfs() to see these flags. In particular, this allows opendir() to detect that a -t nullfs -o union mount needs deduplication (otherwise at least . and .. are returned twice) and allows rtld to detect a -t nullfs -o noexec mount as noexec. Turn off the MNT_ROOTFS flag from the underlying filesystem because the nullfs mount is definitely not the root filesystem. Reviewed by: kib MFC after: 1 week
245495	16-Jan-2013	kib	Remove the filtering of the acceptable mount options for nullfs, added in r245004. Although the report was for noatime option which is non-functional for the nullfs, other standard options like nosuid or noexec are useful with it. Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au> MFC after: 3 days
245408	14-Jan-2013	kib	The current default size of the nullfs hash table used to lookup the existing nullfs vnode by the lower vnode is only 16 slots. Since the default mode for the nullfs is to cache the vnodes, hash has extremely huge chains. Size the nullfs hashtbl based on the current value of desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a given vnode. Pointy hat to: kib Diagnosed and reviewed by: peter Tested by: peter, pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 5 days
245262	10-Jan-2013	kib	When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed, get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp. Reported by: flo Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days
245033	04-Jan-2013	kib	Fix reversed condition in the assertion. Pointy hat to: kib MFC after: 13 days
245004	03-Jan-2013	kib	Add the "nocache" nullfs mount option, which disables the caching of the free nullfs vnodes, switching nullfs behaviour to pre-r240285. The option is mostly intended as the last-resort when higher pressure on the vnode cache due to doubling of the vnode counts is not desirable. Note that disabling the cache costs more than 2x wall time in the metadata-hungry scenarios. The default is "cache". Tested and benchmarked by: pho (previous version) MFC after: 2 weeks
243340	20-Nov-2012	kib	Remove the check and panic for an impossible condition. The NULL lowervp vnode v_vnlock would cause panic due to NULL pointer dereference much earlier. MFC after: 1 week
243311	19-Nov-2012	attilio	r16312 is not any longer real since many years (likely since when VFS received granular locking) but the comment present in UFS has been copied all over other filesystems code incorrectly for several times. Removes comments that makes no sense now. Reviewed by: kib MFC after: 3 days
242833	09-Nov-2012	attilio	Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag. Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened at the same timeframe.
242476	02-Nov-2012	kib	The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks
241554	14-Oct-2012	kib	Grammar fixes. Submitted by: bf MFC after: 1 week
241548	14-Oct-2012	kib	Replace the XXX comment with the proper description. MFC after: 1 week
240285	09-Sep-2012	kib	Allow shared lookups for nullfs mounts, if lower filesystem supports it. There are two problems which shall be addressed for shared lookups use to have measurable effect on nullfs scalability: 1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup operation to lower fs, resulting vnode is often only shared-locked. Then null_nodeget() cannot instantiate covering vnode for lower vnode, since insmntque1() and null_hashins() require exclusive lock on the lower. Change the assert that lower vnode is exclusively locked to only require any lock. If null hash failed to find pre-existing nullfs vnode for lower vnode and the vnode is shared-locked, the lower vnode lock is upgraded. 2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs inability to detect reclamation of the lower vnode. Reclamation of a nullfs vnode at deactivation time prevents a reference to the lower vnode to become stale. Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode together with the reclamation of the lower vnode. Note that nullfs reclamation procedure calls vput() on the lowervp vnode, temporary unlocking the vnode being reclaimed. This seems to be fine for MPSAFE filesystems, but not-MPSAFE code often put partially initialized vnode on some globally visible list, and later can decide that half-constructed vnode is not needed. If nullfs mount is created above such filesystem, then other threads might catch such not properly initialized vnode. Instead of trying to overcome this case, e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I decided to rely on nearby removal of the support for non-MPSAFE filesystems. In collaboration with: pho MFC after: 3 weeks
234607	23-Apr-2012	trasz	Remove unused thread argument to vrecycle(). Reviewed by: kib
232918	13-Mar-2012	kevlo	Use NULL instead of 0
232383	02-Mar-2012	kib	Do not expose unlocked unconstructed nullfs vnode on mount list. Lock the native nullfs vnode lock before switching the locks. Tested by: pho MFC after: 1 week
232305	29-Feb-2012	kib	Allow shared locks for reads when lower filesystem accept shared locking. Tested by: pho MFC after: 1 week
232304	29-Feb-2012	kib	Document that null_nodeget() cannot take shared-locked lowervp due to insmntque() requirements. Tested by: pho MFC after: 1 week
232303	29-Feb-2012	kib	In null_reclaim(), assert that reclaimed vnode is fully constructed, instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, panicked later when getting rid of nullfs reference on lowervp. While there, remove initializations from the declaration block. Tested by: pho MFC after: 1 week
232301	29-Feb-2012	kib	Always request exclusive lock for the lower vnode in nullfs_vget(). The null_nodeget() requires exclusive lock on lowervp to be able to insmntque() new vnode. Reported by: rea Tested by: pho MFC after: 1 week
232299	29-Feb-2012	kib	Move the code to destroy half-constructed nullfs vnode into helper function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed. Lock the vnode interlock around reassigning the v_vnlock. In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements. Reported by: rea Tested by: pho MFC after: 1 week
232296	29-Feb-2012	kib	Merge a split multi-line comment. MFC after: 1 week
232059	23-Feb-2012	mm	To improve control over the use of mount(8) inside a jail(8), introduce a new jail parameter node with the following parameters: allow.mount.devfs: allow mounting the devfs filesystem inside a jail allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting. Reviewed by: jamie Suggested by: pjd MFC after: 2 weeks
231269	09-Feb-2012	mm	Allow mounting nullfs(5) inside jails. This is now possible thanks to r230129. MFC after: 1 month
230304	18-Jan-2012	rea	Subject: NULLFS: properly destroy node hash Use hashdestroy() instead of naive free(). Approved by: kib MFC after: 2 weeks
229600	05-Jan-2012	dim	In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode pointer 'lowervp' instead of 'vp', which is uninitialized at that point. Reviewed by: kib MFC after: 1 week
229431	03-Jan-2012	kib	Do the vput() for the lowervp in the null_nodeget() for error case too. Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself. Reported and tested by: pho MFC after: 1 week
229428	03-Jan-2012	kib	Document the state of the lowervp vnode for null_nodeget(). Tested by: pho MFC after: 1 week
227697	19-Nov-2011	kib	Existing VOP_VPTOCNP() interface has a fatal flaw that is critical for nullfs. The problem is that resulting vnode is only required to be held on return from the successful call to vop, instead of being referenced. Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to error with EBADF or like. Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix. Tested by: pho Reviewed by: mckusick MFC after: 3 weeks (subject of re approval)
227696	19-Nov-2011	kib	Do not use NULLVPTOLOWERVP() in the null_print(). If diagnostic is compiled in, and show vnode is used from ddb on the faulty nullfs vnode, we get panic instead of vnode dump. MFC after: 1 week
227695	19-Nov-2011	kib	Use the plain panic calls, without additional printing around them. The debugger and dumping support is adequate. Tested by: pho MFC after: 1 week
226688	24-Oct-2011	kib	The use of VOP_ISLOCKED() without a check for the return values can cause false positives. Replace the #ifdef block with the proper ASSERT_VOP_UNLOCKED() assert. Tested by: pho MFC after: 1 week
226687	24-Oct-2011	kib	The only possible error return from null_nodeget() is due to insmntque1 failure (the getnewvnode cannot return an error). In this case, the null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK() in the nullfs_mount() after null_nodeget() failure is wrong. Tested by: pho MFC after: 1 week
226686	24-Oct-2011	kib	The covered vnode must be relocked if it was unlocked. Remove VOP_ISLOCKED test because of this and also because it can lead to false positives. Tested by: pho MFC after: 1 week
226681	24-Oct-2011	pho	Only unlock if the lock is exclusive. Reported by: Subbsd <subbsd gmail com> Discussed with: kib
222167	22-May-2011	rmacklem	Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed. Reviewed by: kib
218965	23-Feb-2011	brucec	Fix typos - remove duplicate "is". PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
212043	31-Aug-2010	rmacklem	Add a null_remove() function to nullfs, so that the v_usecount of the lower level vnode is incremented to greater than 1 when the upper level vnode's v_usecount is greater than one. This is necessary for the NFS clients, so that they will do a silly rename of the file instead of actually removing it when the file is still in use. It is "racy", since the v_usecount is incremented in many places in the kernel with minimal synchronization, but an extraneous silly rename is preferred to not doing a silly rename when it is required. The only other file systems that currently check the value of v_usecount in their VOP_REMOVE() functions are nwfs and smbfs. These file systems choose to fail a remove when the v_usecount is greater than 1 and I believe will function more correctly with this patch, as well. Tested by: to.my.trociny at gmail.com Submitted by: to.my.trociny at gmail.com (earlier version) Reviewed by: kib MFC after: 2 weeks
208128	16-May-2010	kib	Disable bypass for the vop_advlockpurge(). The vop is called after vop_revoke(), the v_data is already destroyed. Reported and tested by: ed
194601	21-Jun-2009	kib	Add explicit struct ucred * argument for VOP_VPTOCNP, to be used by vn_open_cred in default implementation. Valid struct ucred is needed for audit and MAC, and curthread credentials may be wrong. This further requires modifying the interface of vn_fullpath(9), but it is out of scope of this change. Reviewed by: rwatson
193175	31-May-2009	kib	Implement the bypass routine for VOP_VPTOCNP in nullfs. Among other things, this makes procfs <pid>/file working for executables started from nullfs mount. Tested by: pho PR: 94269, 104938
193173	31-May-2009	kib	Do not drop vnode interlock in null_checkvp(). null_lock() verifies that v_data is not-null before calling NULLVPTOLOWERVP(), and dropping the interlock allows for reclaim to clean v_data and free the memory. While there, remove unneeded semicolons and convert the infinite loops to panics. I have a will to remove null_checkvp() altogether, or leave it as a trivial stub, but not now. Reported and tested by: pho
193172	31-May-2009	kib	Lock the real null vnode lock before substitution of vp->v_vnlock. This should not really matter for correctness, since vp->v_lock is not locked before the call, and null_lock() holds the interlock, but makes the control flow for reclaim more clear. Tested by: pho
193092	30-May-2009	trasz	Add VOP_ACCESSX, which can be used to query for newly added V* permissions, such as VWRITE_ACL. For filesystems that don't implement it, there is a default implementation, which works as a wrapper around VOP_ACCESS. Reviewed by: rwatson@
191990	11-May-2009	attilio	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavily changed by this commit so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal such situation.
189961	18-Mar-2009	pho	Do not use null_bypass for VOP_ISLOCKED, directly call default implementation. null_bypass cannot work for the !nullfs-vnodes, in particular, for VBAD vnodes. In collaboration with: kib
189758	13-Mar-2009	attilio	Remove the null_islocked() overloaded vop because the standard one does the same.
189622	10-Mar-2009	kib	Do not use bypass for vop_vptocnp() from nullfs, call standard implementation instead. The bypass does not assume that returned vnode is only held. Reported by: Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com> Reviewed by: jhb Tested by: pho, pluknet <pluknet gmail com>
187959	31-Jan-2009	bz	Remove unused local variables. Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks
185335	26-Nov-2008	kib	In null_lookup(), do the needed cleanup instead of panicking saying the cleanup is needed. Reported by: kris, pho Tested by: pho MFC after: 2 weeks
184413	28-Oct-2008	trasz	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)
184205	23-Oct-2008	des	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months
182943	11-Sep-2008	ed	Fix two small typos in comments in the nullfs vnops code. Submitted by: Jille Timmermans <jille quis cx>
177785	31-Mar-2008	kib	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho
177725	29-Mar-2008	jeff	- Simplify null_hashget() and null_hashins() by using vref() rather than a complex series of steps involving vget() without a lock type to emulate the same thing.
176559	25-Feb-2008	attilio	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>
176116	08-Feb-2008	attilio	Convert all explicit instances to VOP_ISLOCKED(arg, NULL) into VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should only acquire curthread as argument; this will lead in axing the additional argument from both functions, making the code cleaner. Reviewed by: jeff, kib
175635	24-Jan-2008	attilio	Cleanup lockmgr interface and exported KPI: - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and not currently used. Hopefully this will be soon replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Additionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavily broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo
175294	13-Jan-2008	attilio	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
175202	10-Jan-2008	attilio	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
172697	16-Oct-2007	alfred	Get rid of qaddr_t. Requested by: bde
172644	14-Oct-2007	daichi	This changes give nullfs correctly work with latest unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
170093	29-May-2007	rwatson	Where I previously removed calls to kdb_enter(), now remove include of kdb.h. Pointed out by: bde
170014	27-May-2007	rwatson	Rather than entering the debugger via kdb_enter() in the event the root vnode is unexpectedly locked under NULLFS_DEBUG in nullfs and then returning EDEADLK, panic.
169671	18-May-2007	kib	Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.
167497	13-Mar-2007	tegge	Make insmntque() externally visible and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnode belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib
166774	15-Feb-2007	pjd	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs
164248	13-Nov-2006	kmacy	change vop_lock handling to allowing tracking of callers' file and line for acquisition of lockmgr locks Approved by: scottl (standing in for mentor rwatson)
162647	26-Sep-2006	tegge	Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().
159023	28-May-2006	rodrigc	Remove incorrect null_checkexp() routine. This will allow the NFS server to call vfs_stdcheckexp() on the exported nullfs filesystem, not the underlying filesystem being nullfs mounted. If the lower filesystem was not NFS exported, then the NFS exported null filesystem would not work. Pointed out by: scottl PR: kern/87906 MFC after: 1 week
159019	28-May-2006	rodrigc	Modify MNT_UPDATE behavior for nullfs so that it does not return EOPNOTSUPP if an "export" parameter was passed in. This should allow nullfs mounts to be NFS exported. PR: kern/87906 MFC after: 1 week
156585	12-Mar-2006	jeff	- Define a null_getwritemount to get the mount-point for the lower filesystem so that nullfs doesn't permit you to circumvent snapshots. Discussed with: tegge Sponsored by: Isilon Systems, Inc.
155899	22-Feb-2006	jeff	- spell VOP_LOCK(vp, LK_RELEASE... VOP_UNLOCK(vp,... so that asserts in vop_lock_post do not trigger. - Rearrange null_inactive to null_hashrem earlier so there is no chance of finding the null node on the hash list after the locks have been switched. - We should never have a NULL lowervp in null_reclaim() so there is no need to handle this situation. panic instead. MFC After: 1 week
155898	22-Feb-2006	jeff	- Assert that the lowervp is locked in null_hashget(). - Simplify the logic dealing with recycled vnodes in null_hashget() and null_hashins(). Since we hold the lower node locked in both cases the null node can not be undergoing recycling unless reclaim somehow called null_nodeget(). The logic that was in place was not safe and was essentially dead code. MFC After: 1 week
155508	10-Feb-2006	jhb	Correctly set MNTK_MPSAFE flag from the lower vnode's mount rather than always turning it on along with any flags set in the lower mount. Tested by: kris Reviewed by: jeff MFC after: 3 days
155423	07-Feb-2006	jeff	- No need to WANTPARENT when we're just going to vrele it in a deadlock prone way later. Reported by: kkenn MFC After: 3 days
153400	14-Dec-2005	des	Eradicate caddr_t from the VFS API.
151897	31-Oct-2005	rwatson	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
151393	16-Oct-2005	kris	Reflect mpsafety of the underlying filesystem in the nullfs image. I benchmarked this by simultaneously extracting 4 large tarballs (basically world images) on a 4-processor AMD64 system, in a malloc-backed md. With this patch, system time was reduced by 43%, and wall clock time by 33%. Submitted by: jeff MFC after: 1 week
150181	15-Sep-2005	kan	Handle a race condition where NULLFS vnode can be cleaned while threads can still be asleep waiting for lowervp lock. Tested by: kkenn Discussed with: ssouhlal, jeffr
149722	02-Sep-2005	ssouhlal	Use vput() instead of vrele() in null_reclaim() since the lower vnode is locked. MFC after: 3 days
145424	22-Apr-2005	jeff	- As this is presently the one and only place where duplicate acquires of the vnode interlock are allowed mark it by passing MTX_DUPOK to this lock operation only. Sponsored by: Isilon Systems, Inc.
144904	11-Apr-2005	jeff	- Clear VI_OWEINACT before calling vget() with no lock type. We know the node is actually already locked, and VOP_INACTIVE is not desirable in this case.
144903	11-Apr-2005	jeff	- Honor the flags argument passed to null_root(). The filesystem below us will decide whether or not to grab a real shared lock.
144058	24-Mar-2005	jeff	- Update vfs_root implementations to match the new prototype. None of these filesystems will support shared locks until they are explicitly modified to do so. Careful review must be done to ensure that this is safe for each individual filesystem. Sponsored by: Isilon Systems, Inc.
143744	17-Mar-2005	jeff	- Lock the clearing of v_data so it is safe to inspect it with the interlock. Sponsored by: Isilon Systems, Inc.
143642	15-Mar-2005	jeff	- Assume that all lower filesystems now support proper locking. Assert that they set v->v_vnlock. This is true for all filesystems in the tree. - Remove all uses of LK_THISLAYER. If the lower layer is locked, the null layer is locked. We only use vget() to get a reference now. null essentially does no locking. This fixes LOOKUP_SHARED with nullfs. - Remove the special LK_DRAIN considerations, I do not believe this is needed now as LK_DRAIN doesn't destroy the lower vnode's lock, and it's hardly used anymore. - Add one well commented hack to prevent the lowervp from going away while we're in its VOP_LOCK routine. This can only happen if we're forcibly unmounted while some callers are waiting in the lock. In this case the lowervp could be recycled after we drop our last ref in null_reclaim(). Prevent this with a vhold().
143630	15-Mar-2005	jeff	- We have to transfer lockers after resetting our vnlock pointer. Sponsored by: Isilon Systems, Inc.
143513	13-Mar-2005	jeff	- The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. Check that rather than VI_XLOCK. - VOP_INACTIVE should no longer drop the vnode lock. - The vnode lock is required around calls to vrecycle() and vgone(). Sponsored by: Isilon Systems, Inc.
142011	17-Feb-2005	phk	Introduce vx_wait{l}() and use it instead of home-rolled versions.
141447	07-Feb-2005	phk	Remove vop_destroyvobject()
140939	28-Jan-2005	phk	Make filesystems get rid of their own vnodes vnode_pager object in VOP_RECLAIM().
140936	28-Jan-2005	phk	Remove unused argument to vrecycle()
140783	25-Jan-2005	phk	Take VOP_GETVOBJECT() out to pasture. We use the direct pointer now.
140780	24-Jan-2005	phk	Don't implement vop_createvobject(), vop_open() and vop_close() manages this for nullfs now.
140776	24-Jan-2005	phk	Add null_open() and null_close() which calls null_bypass() and managed the v_object pointer.
140734	24-Jan-2005	phk	Kill the VV_OBJBUF and test the v_object for NULL instead.
140732	24-Jan-2005	phk	Remove "register" keywords.
140728	24-Jan-2005	phk	Style: Remove the commented out vop_foo_args replicas.
140165	13-Jan-2005	phk	Change the generated VOP_ macro implementations to improve type checking and KASSERT coverage. After this check there is only one "nasty" cast in this code but there is a KASSERT to protect against the wrong argument structure behind that cast. Un-inlining the meat of VOP_FOO() saves 35kB of text segment on a typical kernel with no change in performance. We also now run the checking and tracing on VOP's which have been layered by nullfs, umapfs, deadfs or unionfs. Add new (non-inline) VOP_FOO_AP() functions which take a "struct foo_args" argument and do everything the VOP_FOO() macros used to do with checks and debugging code. Add KASSERT to VOP_FOO_AP() check for argument type being correct. Slim down VOP_FOO() inline functions to just stuff arguments into the struct foo_args and call VOP_FOO_AP(). Put function pointer to VOP_FOO_AP() into vop_foo_desc structure and make VCALL() use it instead of the current offsetof() hack. Retire vcall() which implemented the offsetof() hack. Make deadfs and unionfs use VOP_FOO_AP() calls instead of VCALL(), we know which specific call we want already. Remove unneeded arguments to VCALL() in nullfs and umapfs bypass functions. Remove unused vdesc_offset and VOFFSET(). Generally improve style/readability of the generated code.
140048	11-Jan-2005	phk	Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to be the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson
139984	10-Jan-2005	phk	whitespace
139776	06-Jan-2005	imp	/* -> /*- for copyright notices, minor format tweaks as necessary
138483	06-Dec-2004	phk	Use vfs_mountedfrom(), rely on vfs_mount.c calling VFS_STATFS().
138412	05-Dec-2004	phk	VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases don't. Most of the implementations have grown weeds for this so they copy some fields from mnt_stat if the passed argument isn't that. Fix this the cleaner way: Always call the implementation on mnt_stat and copy that in toto to the VFS_STATFS argument if different.
138290	01-Dec-2004	phk	Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to believe this would ever be feasible in the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and prototype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)
138270	01-Dec-2004	phk	Mechanically change prototypes for vnode operations to use the new typedefs.
138105	26-Nov-2004	phk	Eliminate null_open() and use instead null_bypass(). Null_open() was only here to handle MNT_NODEV, but since that does not affect any filesystems anymore, it could only have any effect if you nullfs mounted a devfs but didn't want devices to show up. If you need that, there are easier ways.
138075	25-Nov-2004	phk	Use system wide no-op vfs_start function.
137479	09-Nov-2004	phk	Refuse attempts to mount root filesystem
132902	30-Jul-2004	phk	Put a version element in the VFS filesystem configuration structure and refuse initializing filesystems with a wrong version. This will aid maintenance activities on the 5-stable branch. s/vfs_mount/vfs_omount/ s/vfs_nmount/vfs_mount/ Name our filesystems mount function consistently. Eliminate the nameidata argument to both vfs_mount and vfs_omount. It was originally there to save stack space. A few places abused it to get hold of some credentials to pass around. Effectively it is unused. Reorganize the root filesystem selection code.
132023	12-Jul-2004	alfred	Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.
131923	10-Jul-2004	marcel	Update for the KDB framework: o Call kdb_enter() instead of Debugger(). o Make debugging code conditional upon KDB instead of DDB.
128019	07-Apr-2004	imp	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
124404	11-Jan-2004	truckman	Don't try to unlock the directory vnode in null_lookup() if the lock is shared with the underlying file system and the lookup in the underlying file system did the unlock for us.
123932	28-Dec-2003	bde	v_vxproc was a bogus name for a thread (pointer).
116469	17-Jun-2003	tjr	MFp4: Fix two bugs causing possible deadlocks or panics, and one nit: - Emulate lock draining (LK_DRAIN) in null_lock() to avoid deadlocks when the vnode is being recycled. - Don't allow null_nodeget() to return a nullfs vnode from the wrong mount when multiple nullfs's are mounted. It's unclear why these checks were removed in null_subr.c 1.35, but they are definitely necessary. Without the checks, trying to unmount a nullfs mount will erroneously return EBUSY, and forcibly unmounting with -f will cause a panic. - Bump LOG2_SIZEVNODE up to 8, since vnodes are >256 bytes now. The old value (7) didn't cause any problems, but made the hash algorithm suboptimal. These changes fix nullfs enough that a parallel buildworld succeeds. Submitted by: tegge (partially; LK_DRAIN) Tested by: kris
116271	12-Jun-2003	phk	Initialize struct vfsops C99-sparsely. Submitted by: hmp Reviewed by: phk
115486	31-May-2003	phk	Use temporary variable to avoid double expansion of macro with side effects. Found by: FlexeLint
111841	03-Mar-2003	njl	Finish cleanup of vprint() which was begun with changing v_tag to a string. Remove extraneous uses of vop_null, instead deferring to the default op. Rename vnode type "vfs" to the more descriptive "syncer". Fix formatting for various filesystems that use vop_print.
111119	19-Feb-2003	imp	Back out M_* changes, per decision of the TRB. Approved by: trb
109623	21-Jan-2003	alfred	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
108470	30-Dec-2002	schweikh	Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/ Add FreeBSD Id tag where missing.
105211	16-Oct-2002	phk	Be consistent about functions being static. Spotted by: FlexeLint
105077	14-Oct-2002	mckusick	Regularize the vop_stdlock'ing protocol across all the filesystems that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock). In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme. Sponsored by: DARPA & NAI Labs.
103936	25-Sep-2002	jeff	- Use vrefcnt() where it is safe to do so instead of doing direct and unlocked accesses to v_usecount. - Lock access to the buf lists in the various sync routines. interlock locking could be avoided almost entirely in leaf filesystems if the fsync function had a generic helper.
103934	25-Sep-2002	jeff	- Hold the vp lock while accessing v_vflags.
103314	14-Sep-2002	njl	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)
101308	04-Aug-2002	jeff	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
98183	13-Jun-2002	semenu	Fix a race during null node creation between relookuping the hash and adding vnode to hash. The fix is to use atomic hash-lookup-and-add-if- not-found operation. The odd thing is that this race can't happen actually because the lowervp vnode is locked exclusively now during the whole process of null node creation. This must be thought as a step toward shared lookups. Also remove vp->v_mount checks when looking for a match in the hash, as this is the vestige. Also add comments and cosmetic changes.
98177	13-Jun-2002	semenu	Change null_hashlock into null_hashmtx, because there is no need for lockmgr and this helps to vget() vnode from hash without a race. Reviewed by: bp MFC after: 2 weeks
98176	13-Jun-2002	semenu	Fix the "error" path (when dropping not fully initialized vnode). Also move hash operations out of null_vnops.c and explicitly initialize v_lock in null_node_alloc (to set wmesg). Reviewed by: bp MFC after: 2 weeks
98175	13-Jun-2002	semenu	Fix wrong locking in null_inactive and null_reclaim. This makes nullfs relatively working back. Reviewed by: mckusick, bp
97186	23-May-2002	mux	Convert nullfs to nmount.
97072	21-May-2002	semenu	Fix null_lock() not unlocking vp->v_interlock if LK_THISLAYER. Reviewed by: bp@FreeBSD.org MFC after: 1 week
96755	16-May-2002	trhodes	More s/file system/filesystem/g
92540	18-Mar-2002	mckusick	Cannot release vnode underlying the nullfs vnode in null_inactive as it leaves the nullfs vnode allocated, but with no identity. The effect is that a null mount can slowly accumulate all the vnodes in the system, reclaiming them only when it is unmounted. Thus the null_inactive state instead accelerates the release of the null vnode by calling vrecycle which will in turn call the null_reclaim operator. The null_reclaim routine then does the freeing actions previously (incorrectly) done in null_inactive.
92462	17-Mar-2002	mckusick	Add a flags parameter to VFS_VGET to pass through the desired locking flags when acquiring a vnode. The immediate purpose is to allow polling lock requests (LK_NOWAIT) needed by soft updates to avoid deadlock when enlisting other processes to help with the background cleanup. For the future it will allow the use of shared locks for read access to vnodes. This change touches a lot of files as it affects most filesystems within the system. It has been well tested on FFS, loopback, and CD-ROM filesystems. only lightly on the others, so if you find a problem there, please let me (mckusick@mckusick.com) know.
83366	12-Sep-2001	julian	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
78179	13-Jun-2001	mjacob	the last argument to copyinstr is of type size_t, not u_int
77130	24-May-2001	ru	mount_null(8) -> mount_nullfs(8).
77031	23-May-2001	ru	- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs. - Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs. - Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS. - Install header files for the above file systems. - Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.
76688	16-May-2001	iedowse	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp
76166	01-May-2001	markm	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
74273	15-Mar-2001	rwatson	o Change the API and ABI of the Extended Attribute kernel interfaces to introduce a new argument, "namespace", rather than relying on a first- character namespace indicator. This is in line with more recent thinking on EA interfaces on various mailing lists, including the posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces are defined by default, EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER, where the primary distinction lies in the access control model: user EAs are accessible based on the normal MAC and DAC file/directory protections, and system attributes are limited to kernel-originated or appropriately privileged userland requests. o These API changes occur at several levels: the namespace argument is introduced in the extattr_{get,set}_file() system call interfaces, at the vnode operation level in the vop_{get,set}extattr() interfaces, and in the UFS extended attribute implementation. Changes are also introduced in the VFS extattrctl() interface (system call, VFS, and UFS implementation), where the arguments are modified to include a namespace field, as well as modified to avoid direct access to userspace variables from below the VFS layer (in the style of recent changes to mount by adrian@FreeBSD.org). This required some cleanup and bug fixing regarding VFS locks and the VFS interface, as a vnode pointer may now be optionally submitted to the VFS_EXTATTRCTL() call. Updated documentation for the VFS interface will be committed shortly. o In the near future, the auto-starting feature will be updated to search two sub-directories to the ".attribute" directory in appropriate file systems: "user" and "system" to locate attributes intended for those namespaces, as the single filename is no longer sufficient to indicate what namespace the attribute is intended for. Until this is committed, all attributes auto-started by UFS will be placed in the EXTATTR_NAMESPACE_SYSTEM namespace.
o The default POSIX.1e attribute names for ACLs and Capabilities have been updated to no longer include the '$' in their filename. As such, if you're using these features, you'll need to rename the attribute backing files to the same names without '$' symbols in front. o Note that these changes will require changes in userland, which will be committed shortly. These include modifications to the extended attribute utilities, as well as to libutil for new namespace string conversion routines. Once the matching userland changes are committed, a buildworld is recommended to update all the necessary include files and verify that the kernel and userland environments are in sync. Note: If you do not use extended attributes (most people won't), upgrading is not imperative although since the system call API has changed, the new userland extended attribute code will no longer compile with old include files. o Couple of minor cleanups while I'm there: make more code compilation conditional on FFS_EXTATTR, which should recover a bit of space on kernels running without EA's, as well as update copyright dates. Obtained from: TrustedBSD Project
73286	01-Mar-2001	adrian	Reviewed by: jlemon An initial tidyup of the mount() syscall and VFS mount code. This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work. * the guts of the mount work has been moved into vfs_mount(). * move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace. * Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long. * rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.) * remove the copyin() stuff for `path'. `data' still requires copyin() since it's a pointer into userland. * set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to. * NOTE: f_mntonname is initialised with "/" in the case of a root mount.
72200	09-Feb-2001	bmilekic	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarly, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
71999	04-Feb-2001	phk	Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details. Created with: sed(1) Reviewed by: md5(1)
67882	29-Oct-2000	phk	Remove unneeded #include <sys/proc.h> lines.
67441	22-Oct-2000	bp	Rev 1.41 was committed from wrong diff, now do it right.
67439	22-Oct-2000	bp	Release and unlock vnode if resource deadlock detected.
67145	15-Oct-2000	bp	Fix nullfs breakage caused by incomplete migration of v_interlock from simple_lock to mutex. Reset LK_INTERLOCK flag when interlock released manually.
66615	04-Oct-2000	jasone	Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
66570	03-Oct-2000	bp	Prevent dereference of NULL pointer when null_lock() and null_unlock() called and there is no underlying vnode.
66356	25-Sep-2000	bp	Fix vnode locking bugs in the nullfs. Add correct support for v_object management, so mmap() operation should work properly. Add support for extattrctl() routine (submitted by semenu). At this point nullfs can be considered as functional and much more stable. In fact, it should behave as a "hard" "symlink" to underlying filesystem. Reviewed in general by: mckusick, dillon Parts of logic obtained from: NetBSD
65467	05-Sep-2000	bp	Various cleanups towards making nullfs functional (it is still broken at this point): Replace all '#ifdef DEBUG' with '#ifdef NULLFS_DEBUG' and add NULLFSDEBUG macro. Protect nullfs hash table with lockmgr. Use proper order of operations when freeing mnt_data. Return correct fsid in the null_getattr(). Add null_open() function to catch MNT_NODEV (obtained from NetBSD). Add null_rename() to catch cross-fs rename operations (submitted by Ustimenko Semen <semen@iclub.nsu.ru>) Remove duplicate $FreeBSD$ tags.
65464	05-Sep-2000	bp	Get rid of the __P() macros. Encouraged by: peter
63962	28-Jul-2000	sheldonh	Rename the loadable nullfs kernel module: null -> nullfs
60938	26-May-2000	jake	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others
60833	23-May-2000	jake	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
59794	30-Apr-2000	phk	Remove unneeded #include <vm/vm_zone.h> Generated by: src/tools/tools/kerninclude
59368	18-Apr-2000	phk	Remove unneeded <sys/buf.h> includes. Due to some interesting cpp tricks in lockmgr, the LINT kernel shrinks by 924 bytes.
56272	19-Jan-2000	rwatson	Fix bde'isms in acl/extattr syscall interface, renaming syscalls to prettier (?) names, adding some const's around here, et al. Reviewed by: bde
55206	29-Dec-1999	peter	Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSD's who made this change quite some time ago. More commits to come.
54803	19-Dec-1999	rwatson	Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry. Reviewed by: eivind
54655	15-Dec-1999	eivind	Introduce NDFREE (and remove VOP_ABORTOP)
54444	11-Dec-1999	eivind	Lock reporting and assertion changes. * lockstatus() and VOP_ISLOCKED() gets a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process. * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them * Extend the vnode_if.src format to allow more exact specification than locked/unlocked. This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS. Discussed with: grog, mch, peter, phk Reviewed by: peter
51138	11-Sep-1999	alfred	Separate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP. Add fh(open|stat|statfs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD
50890	04-Sep-1999	bde	Get rid of the NULLFS_DIAGNOSTIC option. This option was as useful as the other XXXFS_DIAGNOSTIC options (not very) and mostly controlled tracing of normal operation. Use `#ifdef DEBUG' for non-diagnostics and `#ifdef DIAGNOSTIC' for diagnostics.
50616	30-Aug-1999	bde	Converted the silly SAFETY option into a new-style option by renaming it to DIAGNOSTIC. Fixed an English style bug in the panic messages controlled by SAFETY.
50477	28-Aug-1999	peter	$Id$ -> $FreeBSD$
48468	02-Jul-1999	phk	Make sure that stat(2) and friends always return a valid st_dev field. Pseudo-FS need not fill in the va_fsid anymore, the syscall code will use the first half of the fsid, which now looks like a udev_t with major 255.
47964	16-Jun-1999	mckusick	Add a vnode argument to VOP_BWRITE to get rid of the last vnode operator special case. Delete special case code from vnode_if.sh, vnode_if.src, umap_vnops.c, and null_vnops.c.
43311	28-Jan-1999	dillon	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
43305	27-Jan-1999	dillon	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
41591	07-Dec-1998	archie	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.
38909	07-Sep-1998	bde	Removed statically configured mount type numbers (MOUNT_) and all references to them. The change a couple of days ago to ignore these numbers in statically configured vfsconf structs was slightly premature because the cd9660, cfs, devfs, ext2fs, nfs vfs's still used MOUNT_ instead of the number in their vfsconf struct.
37977	30-Jul-1998	bde	Fixed printf format errors.
37649	15-Jul-1998	bde	Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this change is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.
37384	04-Jul-1998	julian	VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>
36840	10-Jun-1998	peter	Don't silently accept attempts to change flags where they are not supported.
35769	06-May-1998	msmith	As described by the submitter: Reverse the VFS_VRELE patch. Reference counting of vnodes does not need to be done per-fs. I noticed this while fixing vfs layering violations. Doing reference counting in generic code is also the preference cited by John Heidemann in recent discussions with him. The implementation of alternative vnode management per-fs is still a valid requirement for some filesystems but will be revisited sometime later, most likely using a different framework. Submitted by: Michael Hancock <michaelh@cet.co.jp>
35256	17-Apr-1998	des	Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.
33964	01-Mar-1998	msmith	The intent is to get rid of WILLRELE in vnode_if.src by making a complement to all ops that return a vpp, VFS_VRELE. This is initially only for file systems that implement the following ops that do a WILLRELE: vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link, vop_rename, vop_mkdir, vop_rmdir, vop_symlink This is initial DNA that doesn't do anything yet. VFS_VRELE is implemented but not called. A default vfs_vrele was created for fs implementations that use the standard vnode management routines. VFS_VRELE implementations were made for the following file systems: Standard (vfs_vrele) ffs mfs nfs msdosfs devfs ext2fs Custom union umapfs Just EOPNOTSUPP fdesc procfs kernfs portal cd9660 These implementations may change as VOP changes are implemented. In the next phase, in the vop implementations calls to vrele and the vrele part of vput will be moved to the top layer vfs_vnops and made visible to all layers. vput will be replaced by unlock in these cases. Unlocking will still be done in the per fs layer but the refcount decrement will be triggered at the top because it doesn't hurt to hold a vnode reference a little longer. This will have minimal impact on the structure of the existing code. This will only be done for vnode arguments that are released by the various fs vop implementations. Wider use of VFS_VRELE will likely require restructuring of the code. Reviewed by: phk, dyson, terry et. al. Submitted by: Michael Hancock <michaelh@cet.co.jp>
33181	09-Feb-1998	eivind	Staticize.
33134	06-Feb-1998	eivind	Back out DIAGNOSTIC changes.
33108	04-Feb-1998	eivind	Turn DIAGNOSTIC into a new-style option.
32929	31-Jan-1998	eivind	Make the debug options new-style. This also zaps a DPT option from lint; it wasn't referenced from anywhere.
32150	01-Jan-1998	bde	Fixed missing initialization of mp->mnt_stat. At least vm depends on at least mp->mnt_stat.f_iosize being nonzero. PR: 5212
30636	21-Oct-1997	roberto	Fix the file leak bug. The lower layer wasn't informed the vnode was inactive and kept a reference, preventing the blocks to be reclaimed. Changed the comment in null_inactive to reflect the current situation. Reviewed by: phk
30434	15-Oct-1997	phk	Hmm, realign the vnops into two columns.
30431	15-Oct-1997	phk	Stylistic overhaul of vnops tables. 1. Remove comment stating the blatantly obvious. 2. Align in two columns. 3. Sort all but the default element alphabetically. 4. Remove XXX comments pointing out entries not needed.
30354	12-Oct-1997	phk	Last major round (Unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde
29584	18-Sep-1997	phk	Executing binaries on a nullfs (or nullfs-based) filesystem results in a trap. PR: 3104 Reviewed by: phk Submitted by: Dan Walters hannibal@cyberstation.net
29179	07-Sep-1997	bde	Some staticized variables were still declared to be extern.
28844	28-Aug-1997	kato	Include "opt_ddb.h" only when NULLFS_DIAGNOSTIC is defined.
28832	27-Aug-1997	kato	Fixed NULLFS_DIAGNOSTIC stuff.
28270	16-Aug-1997	wollman	Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to now malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.
27845	02-Aug-1997	bde	Removed unused #includes.
26964	26-Jun-1997	alex	More comment cleanup.
26963	26-Jun-1997	alex	Typo police.
26111	25-May-1997	peter	Fix some warnings (missing prototypes, wrong "generic" args etc) umapfs uses one of nullfs's functions...
25016	19-Apr-1997	kato	Avoid the `lock against myself' panic triggered by the following operation: # mount -t union (or null) dir1 dir2 # mount -t union (or null) dir2 dir1 The function namei in union_mount calls union_root. The upper vnode is already locked, and vn_lock in union_root causes the above panic. Add printf's under `#ifdef DIAGNOSTIC' for EDEADLK cases.
24988	17-Apr-1997	kato	Fix the `locking against myself' panic caused by multiple nullfs mounts of the same directory pair.
24987	17-Apr-1997	kato	Use NULLVP instead of NULL.
22975	22-Feb-1997	peter	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
22607	12-Feb-1997	mpp	Eliminate the last of the compile warnings in this module by correctly casting the arguments to all of the null_bypass() calls.
22605	12-Feb-1997	mpp	Restore #include <sys/kernel.h> so that this compiles without warnings again.
22597	12-Feb-1997	mpp	Make this compile again after the Lite2 merge. Also add missing function prototypes.
22521	10-Feb-1997	dyson	This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
21673	14-Jan-1997	jkh	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
16312	12-Jun-1996	dg	Moved the fsnode MALLOC to before the call to getnewvnode() so that the process won't possibly block before filling in the fsnode pointer (v_data) which might be dereferenced during a sync since the vnode is put on the mnt_vnodelist by getnewvnode. Pointed out by Matt Day <mday@artisoft.com>
12769	11-Dec-1995	phk	Staticize.
12595	03-Dec-1995	bde	Added prototypes. Removed some unnecessary #includes.
12594	03-Dec-1995	bde	null_node_find() and umap_node_find() were sometimes called without a `struct mount *' arg. I don't know what the effects of this were.
12158	09-Nov-1995	bde	Introduced a type `vop_t' for vnode operation functions and used it 1138 times (:-() in casts and a few more times in declarations. This change is null for the i386. The type has to be `typedef int vop_t(void *)' and not `typedef int vop_t()' because `gcc -Wstrict-prototypes' warns about the latter. Since vnode op functions are called with args of different (struct pointer) types, neither of these function types is any use for type checking of the arg, so it would be preferable not to use the complete function type, especially since using the complete type requires adding 1138 casts to avoid compiler warnings and another 40+ casts to reverse the function pointer conversions before calling the functions.
8876	30-May-1995	rgrimes	Remove trailing whitespace.
7170	19-Mar-1995	dg	Removed redundant newlines that were in some panic strings.
7095	16-Mar-1995	wollman	Add four more filesystem flags: VFCF_NETWORK (this FS goes over the net) VFCF_READONLY (read-write mounts do not make any sense) VFCF_SYNTHETIC (data in this FS is not real) VFCF_LOOPBACK (this FS aliases something else) cd9660 is readonly; nullfs, umapfs, and union are loopback; NFS is network; procfs, kernfs, and fdesc are synthetic.
7090	16-Mar-1995	bde	Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
3496	10-Oct-1994	phk	Cosmetics. reduce the noise from gcc -Wall.
3311	02-Oct-1994	phk	GCC cleanup.
2979	22-Sep-1994	wollman	More loadable VFS changes: - Make a number of filesystems work again when they are statically compiled (blush) - FIFOs are no longer optional; ``options FIFO'' removed from distributed config files.
2960	21-Sep-1994	wollman	Fix a few niggling little bugs: - set args->lkm_offset correctly so that VFS modules can be unloaded - initialize _fs_vfsops.vfc_refcount correctly so that VFS modules can be unloaded - include kernel.h in a few places to get the correct definition of DATA_SET
2946	21-Sep-1994	wollman	Implemented loadable VFS modules, and made most existing filesystems loadable. (NFS is a notable exception.)
2142	20-Aug-1994	dg	1) cleaned up after Garrett - fixed more redundant declarations, changed use of timeout_t -> timeout_func_t in aha1542 and aha1742 drivers. 2) fix a bug in the portalfs that was uncovered by better prototyping - specifically, the time must be converted from timeval to timespec before storing in va_atime. 3) fixed/added some miscellaneous prototypes
1817	02-Aug-1994	dg	Added $Id$
1549	25-May-1994	rgrimes	The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
1541	24-May-1994	rgrimes	BSD 4.4 Lite Kernel Sources