History log of /freebsd-9.3-release/sys/fs/tmpfs/
Revision Date Author Comments
267654 20-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


263944 30-Mar-2014 bdrewery

MFC r263130:

Fix -o size less than PAGE_SIZE resulting in SIZE_MAX being used.
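
The effect of this fix can be illustrated with a minimal sketch. It assumes the mount code converts the requested size in bytes to a page limit and uses SIZE_MAX to mean "no limit". The helper name size_to_pages() and its exact conditions are illustrative assumptions, not the code from r263130.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a requested size in bytes to a page limit. */
static size_t
size_to_pages(uint64_t size_max, size_t page_size)
{
	assert(page_size > 0);
	/* Only an unspecified (zero) or unrepresentably large size is "no limit". */
	if (size_max == 0 || size_max > (uint64_t)SIZE_MAX - page_size)
		return (SIZE_MAX);
	/* Round up, so a request smaller than one page still gets one page. */
	return ((size_t)((size_max + page_size - 1) / page_size));
}
```

Under these assumptions a request of a single byte yields a one-page limit rather than an unlimited file system.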


254550 20-Aug-2013 kib

MFC r253967:
Wait for the doomed vnode to be detached from the tmpfs
node if sleepable allocation is requested.


248678 24-Mar-2013 kib

MFC r248422:
Remove negative name cache entry pointing to the target name, which
could be instantiated while tdvp was unlocked.


242368 30-Oct-2012 alfred

MFC: Allow NFS exports of tmpfs. r234346

Approved by: kevlo, kib


240238 08-Sep-2012 kib

MFC r239065:
Stop including vm_param.h into vm_page.h. Include vm_param.h
explicitly for the kernel code which needs it.


236209 28-May-2012 alc

MFC r230120
Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call
vm_object_pip_{add,subtract}() on the swap object because the swap
object can't be destroyed while the vnode is exclusively locked.
Moreover, even if the swap object could have been destroyed during
tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken
because vm_object_pip_subtract() does not wake up the sleeping thread
that is trying to destroy the swap object.

Free invalid pages after an I/O error. There is no virtue in keeping
them around in the swap object creating more work for the page daemon.
(I believe that any non-busy page in the swap object will now always
be valid.)

vm_pager_get_pages() does not return a standard errno, so its return
value should not be returned by tmpfs without translation to an errno
value.

There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to
occur with the swap object locked.

Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite().
(The swap pager already spams your console if data corruption is
imminent.)


236208 28-May-2012 alc

MFC r229821
Correct an error of omission in the implementation of the truncation
operation on POSIX shared memory objects and tmpfs. Previously, neither of
these modules correctly handled the case in which the new size of the object
or file was not a multiple of the page size. Specifically, they did not
handle partial page truncation of data stored on swap. As a result, stale
data might later be returned to an application.

Interestingly, a data inconsistency was less likely to occur under tmpfs
than POSIX shared memory objects. The reason being that a different mistake
by the tmpfs truncation operation helped avoid a data inconsistency. If the
data was still resident in memory in a PG_CACHED page, then the tmpfs
truncation operation would reactivate that page, zero the truncated portion,
and leave the page pinned in memory. More precisely, the benevolent error
was that the truncation operation didn't add the reactivated page to any of
the paging queues, effectively pinning the page. This page would remain
pinned until the file was destroyed or the page was read or written. With
this change, the page is now added to the inactive queue.

MFC r230180
When tmpfs_write() resets an extended file to its original size after an
error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally
update the file's size.
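
The partial-page truncation problem described for r229821 can be sketched in user space. This is an illustration only: the struct, the tiny page size, and the sketch_* names are hypothetical, and the real code operates on VM pages and swap rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE	16	/* tiny page size, for illustration only */
#define SKETCH_ROUNDUP(x)	\
	(((x) + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE)

/* Hypothetical in-memory "file" backed by a whole number of pages. */
struct sketch_file {
	unsigned char data[4 * SKETCH_PAGE_SIZE];
	size_t size;			/* logical size in bytes */
};

/* Shrink the file and zero the part of the last page past the new size. */
static void
sketch_truncate(struct sketch_file *f, size_t new_size)
{
	size_t page_end;

	assert(new_size <= f->size);
	page_end = SKETCH_ROUNDUP(new_size);
	/* Without this, bytes [new_size, page_end) keep their old contents. */
	memset(f->data + new_size, 0, page_end - new_size);
	f->size = new_size;
}

/* Grow the file; only whole new pages are zero-filled here. */
static void
sketch_extend(struct sketch_file *f, size_t new_size)
{
	size_t old_end, new_end;

	assert(new_size >= f->size && new_size <= sizeof(f->data));
	old_end = SKETCH_ROUNDUP(f->size);
	new_end = SKETCH_ROUNDUP(new_size);
	if (new_end > old_end)
		memset(f->data + old_end, 0, new_end - old_end);
	f->size = new_size;
}
```

Because sketch_extend() trusts the tail of the old last page to be zero already, a truncate that skipped the memset would let stale bytes reappear after the file grows again.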


234849 30-Apr-2012 gleb

MFC r233998-r234000 and r234325:

r233998:
Add reserved memory limit sysctl to tmpfs. Clean up available and used
memory functions. Check if free pages are available before allocating a new
node.

r233999 (partial):
Add vfs_getopt_size. Support human readable file system options in tmpfs.
Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.

NOTE: To preserve KBI add tmpfs_getopt_size function instead of global
vfs_getopt_size.

r234000:
tmpfs supports only INT_MAX nodes due to limitations of the unit number
allocator. Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31
nodes in memory is not likely to become possible in the foreseeable future
and would require a new unit number allocator.

r234325:
Provide better description for vfs.tmpfs.memory_reserved sysctl.


234511 20-Apr-2012 rmh

MFC of r227310:

Don asbestos garment and remove the warning about TMPFS being experimental
-- highly experimental even. So far the closest to a bug in TMPFS that people
have gotten to relates to how ZFS can take away from the memory that TMPFS
needs. One can argue that such is not a bug in TMPFS. Irrespective, even if
there is a bug here and there in TMPFS, it's not to our own advantage to
scare people away from using TMPFS. I for one have been using it, even with
ZFS, very successfully.

Reviewed by: marcel


233851 03-Apr-2012 gleb

MFC r232959 and r232960:

Prevent tmpfs_rename() deadlock in a way similar to UFS. Unlock
vnodes and try to lock them one by one. Relookup fvp and tvp.

Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp().
Doomed vnode is hardly of any use here, besides all callers handle
error case. vfs_hash_get() does the same. Don't mess with vnode
holdcount, vget() takes care of it already.

Approved by: mdf (mentor)


233769 02-Apr-2012 delphij

MFC r227802:

Improve the way to calculate available pages in tmpfs:

- Don't deduct wired pages from total usable counts because it does not
make any sense. To make things worse, on systems where swap size is
smaller than physical memory and which use a lot of wired pages (e.g. ZFS),
tmpfs can suddenly have free space of 0 because of this;
- Count cached pages as available; [1]
- Don't count inactive pages as available, technically we could but that
might be too aggressive; [1]

[1] Suggested by kib@


233385 23-Mar-2012 jhb

MFC 232401:
Similar to the fixes in 226967 and 226987, purge any name cache entries
associated with the previous vnode (if any) associated with the target of
a rename(). Otherwise, a lookup of the target pathname concurrent with a
rename() could re-add a name cache entry after the namei(RENAME) lookup
in kern_renameat() had purged the target pathname.


231775 15-Feb-2012 alc

MFC r229363
Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and
tmpfs_nocacheread(). It is both unnecessary and a pessimization. It
results in either the page being zeroed twice or zeroed first and then
overwritten by an I/O operation.


229855 09-Jan-2012 ivoras

MFC r227822: Avoid panics from recursive rename operations.

PR: kern/159418


229130 31-Dec-2011 pho

MFC: r226987

Added missing cache purge of from argument for rename().


225736 23-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


225418 06-Sep-2011 kib

Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into an atomic
flags field. Updates to the atomic flags are performed using the atomic
ops on the containing word, do not require any vm lock to be held, and
are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9)
functions are provided to modify aflags.

Document the changes to flags field to only require the page lock.

Introduce vm_page_reference(9) function to provide a stable KPI and
KBI for filesystems like tmpfs and zfs which need to mark a page as
referenced.

Reviewed by: alc, attilio
Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64)
Approved by: re (bz)


223677 29-Jun-2011 alc

Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this
option to vm_object_page_remove() asserts that the specified range of pages
is not mapped, or more precisely that none of these pages have any managed
mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on
the pages.

This change not only saves time by eliminating pointless calls to
pmap_remove_all(), but it also eliminates an inconsistency in the use of
pmap_remove_all() versus related functions, like pmap_remove_write(). It
eliminates harmless but pointless calls to pmap_remove_all() that were being
performed on PG_UNMANAGED pages.

Update all of the existing assertions on pmap_remove_all() to reflect this
change.

Reviewed by: kib


222167 22-May-2011 rmacklem

Add a lock flags argument to the VFS_FHTOVP() file system
method, so that callers can indicate the minimum vnode
locking requirement. This will allow some file systems to choose
to return a LK_SHARED locked vnode when LK_SHARED is specified
for the flags argument. This patch only adds the flag. It
does not change any file system to use it and all callers
specify LK_EXCLUSIVE, so file system semantics are not changed.

Reviewed by: kib


218949 22-Feb-2011 alc

Eliminate two dubious attempts at optimizing the implementation of a
file's last accessed, modified, and changed times:

TMPFS_NODE_ACCESSED and TMPFS_NODE_CHANGED should be set unconditionally
in tmpfs_remove() without regard to the number of hard links to the file.
Otherwise, after the last directory entry for a file has been removed, a
process that still has the file open could read stale values for the last
accessed and changed times with fstat(2).

Similarly, tmpfs_close() should update the time-related fields even if all
directory entries for a file have been removed. In this case, the effect
is that the time-related fields will have values that are later than
expected. They will correspond to the time at which fstat(2) is called.

In collaboration with: kib
MFC after: 1 week


218863 19-Feb-2011 alc

tmpfs_remove() isn't modifying the file's data, so it shouldn't set
TMPFS_NODE_MODIFIED on the node.

PR: 152488
Submitted by: Anton Yuzhaninov
Reviewed by: kib
MFC after: 1 week


218681 14-Feb-2011 alc

Further simplify tmpfs_reg_resize(). Also, update its comments, including
style fixes.


218640 13-Feb-2011 alc

Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm
object's size field. Previously, that field was always zero, even
when the object tn_reg.tn_aobj contained numerous pages.

Apply style fixes to tmpfs_reg_resize().

In collaboration with: kib


217633 20-Jan-2011 kib

In tmpfs_readdir(), normalize handling of the directory entries that
either overflow the supplied buffer, or cause uiomove to fail.
Do not advance the cached de when a directory entry was not copied out.
Do not return EOF when no entries could be copied because the first entry
is too large for the supplied buffer; signal EINVAL instead.

Reported by: Beat Gätzi <beat chruetertee ch>
MFC after: 1 week
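
The readdir behaviour described in r217633 can be sketched as follows. This is a simplified stand-in, not the tmpfs_readdir() code: the function name, its parameters, and the use of a plain array of record lengths are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a readdir loop.  'reclens' holds the size of each
 * directory entry, '*cursor' is the index of the next entry to return, and
 * 'bufsize' is the space the caller supplied.  On return '*copied' is the
 * number of bytes that would have been copied out.
 */
static int
sketch_readdir(const size_t *reclens, size_t nents, size_t *cursor,
    size_t bufsize, size_t *copied)
{
	assert(reclens != NULL && cursor != NULL && copied != NULL);
	*copied = 0;
	while (*cursor < nents) {
		if (reclens[*cursor] > bufsize - *copied)
			break;		/* entry does not fit: stop here */
		*copied += reclens[*cursor];
		(*cursor)++;		/* advance only after a copy */
	}
	/* Entries remain but not even the first one fit: an error, not EOF. */
	if (*copied == 0 && *cursor < nents)
		return (EINVAL);
	return (0);
}
```

Returning 0 with nothing copied is reserved for the genuine end of the directory, so a caller with a too-small buffer is not misled into thinking the listing is complete.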


213735 12-Oct-2010 avg

tmpfs + sendfile: do not produce partially valid pages for vnode's tail

See r213730 for details of analogous change in ZFS.

MFC after: 3 days


212650 15-Sep-2010 avg

tmpfs, zfs + sendfile: mark page bits as valid after populating it with data

Otherwise, adding insult to injury, in addition to double-caching of data
we would always copy the data into a vnode's vm object page from backend.
This is specific to sendfile case only (VOP_READ with UIO_NOCOPY).

PR: kern/141305
Reported by: Wiktor Niesiobedzki <bsd@vink.pl>
Reviewed by: alc
Tested by: tools/regression/sockets/sendfile
MFC after: 2 weeks


212305 07-Sep-2010 ivoras

Avoid "Entry can disappear before we lock fdvp" panic.

PR: 150143
Submitted by: Gleb Kurtsou <gk at FreeBSD.org>
Pretty sure it won't blow up: mckusick
MFC after: 2 weeks


211598 22-Aug-2010 ed

Add support for whiteouts on tmpfs.

Right now unionfs only allows filesystems to be mounted on top of
another if it supports whiteouts. Even though I have sent a patch to
daichi@ to let unionfs work without it, we'd better also add support for
whiteouts to tmpfs.

This patch implements .vop_whiteout and makes necessary changes to
lookup() and readdir() to take them into account. We must also make sure
that when adding or removing a file, we honour the componentname's
DOWHITEOUT and ISWHITEOUT, to prevent duplicate filenames.

MFC after: 1 month


209226 16-Jun-2010 alc

Eliminate unnecessary page queues locking.


207719 06-May-2010 trasz

Style fixes and removal of unneeded variable.

Submitted by: bde@


207662 05-May-2010 trasz

Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize().

Reviewed by: kib


207644 05-May-2010 alc

Push down the acquisition of the page queues lock into vm_page_unwire().

Update the comment describing which lock should be held on entry to
vm_page_wire().

Reviewed by: kib


207573 03-May-2010 alc

Acquire the page lock around vm_page_unwire() and vm_page_wire().

Reviewed by: kib


207530 02-May-2010 alc

It makes no sense for vm_page_sleep_if_busy()'s helper, vm_page_sleep(),
to unconditionally set PG_REFERENCED on a page before sleeping. In many
cases, it's perfectly ok for the page to disappear, i.e., be reclaimed by
the page daemon, before the caller to vm_page_sleep() is reawakened.
Instead, we now explicitly set PG_REFERENCED in those cases where having
the page persist until the caller is awakened is clearly desirable. Note,
however, that setting PG_REFERENCED on the page is still only a hint,
and not a guarantee that the page should persist.


203164 29-Jan-2010 jh

Add "maxfilesize" mount option for tmpfs to allow specifying the
maximum file size limit. Default is UINT64_MAX when the option is
not specified. It was useless to set the limit to the total amount of
memory and swap in the system.

Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to
check if there is enough memory available.

Remove now unused get_swpgtotal().

Reviewed by: Gleb Kurtsou
Approved by: trasz (mentor)


202708 20-Jan-2010 jh

- Change the type of nodes_max to u_int and use "%u" format string to
convert its value. [1]
- Set default tm_nodes_max to min(pages + 3, UINT32_MAX). It's more
reasonable than the old four nodes per page (with page size 4096) because
non-empty regular files always use at least one page. This fixes possible
overflow in the calculation. [2]
- Don't allow more than tm_nodes_max nodes allocated in tmpfs_alloc_node().

PR: kern/138367
Suggested by: bde [1], Gleb Kurtsou [2]
Approved by: trasz (mentor)
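
The default described in r202708, min(pages + 3, UINT32_MAX), can be computed without overflow as in the sketch below. The helper name is hypothetical and uint32_t stands in for u_int; the comparison is done before the addition so that pages + 3 cannot wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: default node limit for a file system of 'pages' pages. */
static uint32_t
sketch_default_nodes_max(size_t pages)
{
	/* Compare first, so that 'pages + 3' is only evaluated when it fits. */
	if (pages > (size_t)UINT32_MAX - 3)
		return (UINT32_MAX);
	return ((uint32_t)(pages + 3));
}
```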


202187 13-Jan-2010 jh

- Fix some style bugs in tmpfs_mount(). [1]
- Remove a stale comment about tmpfs_mem_info() 'total' argument.

Reported by: bde [1]


201773 08-Jan-2010 jh

- Change the type of size_max to u_quad_t because its value is converted
with vfs_scanopt(9) using the "%qu" format string.
- Limit the maximum value of size_max to (SIZE_MAX - PAGE_SIZE) to
prevent overflow in howmany() macro.

PR: kern/141194
Approved by: trasz (mentor)
MFC after: 2 weeks
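
The overflow that r201773 guards against can be shown with a small sketch. SKETCH_HOWMANY follows the usual round-up-division shape of the howmany() macro; sketch_pages_for() and its clamping are an illustration of the limit described above, not the mount code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round-up division; 'x + (y - 1)' wraps when x is close to SIZE_MAX. */
#define SKETCH_HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/* Hypothetical helper: clamp the size first so the addition cannot wrap. */
static size_t
sketch_pages_for(size_t size_max, size_t page_size)
{
	assert(page_size > 0);
	if (size_max > SIZE_MAX - page_size)
		size_max = SIZE_MAX - page_size;
	return (SKETCH_HOWMANY(size_max, page_size));
}
```

Applied directly to SIZE_MAX with a 4096-byte page, the unclamped macro wraps around and yields zero pages, while the clamped helper yields a large, sensible page count.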


198494 26-Oct-2009 alc

There is no need to "busy" a page when the object is locked for the duration
of the operation.


197953 11-Oct-2009 delphij

Add locking around access to parent node, and bail out when the parent
node is already freed rather than panicking the system.

PR: kern/122038
Submitted by: gk
Tested by: pho
MFC after: 1 week


197850 07-Oct-2009 delphij

Add a special workaround to handle UIO_NOCOPY case. This fixes data
corruption observed when sendfile() is being used.

PR: kern/127213
Submitted by: gk
MFC after: 2 weeks


197740 04-Oct-2009 delphij

Fix a bug, exposed by the fsx test case, that causes mmap'ed pages to be
out of sync with read/write. The fix is inspired by ZFS's counterpart.

PR: kern/139312
Submitted by: gk@
MFC after: 1 week


194766 23-Jun-2009 kib

Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge the appropriate uid when allocation is performed by
the kernel, e.g. md(4).

Several syscalls, among them fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with: pho
Reviewed by: alc
Approved by: re (kensmith)


194124 13-Jun-2009 alc

Eliminate unnecessary variables.


192917 27-May-2009 alc

Eliminate redundant setting of a page's valid bits and pointless clearing
of the same page's dirty bits.


191990 11-May-2009 attilio

Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS. Now all the VFS_* functions and related parts no longer take the
context, since it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

The VFS KPI is heavily changed by this commit, so third-party modules need
to be recompiled. Bump __FreeBSD_version in order to signal this
situation.


188929 22-Feb-2009 alc

Use uiomove_fromphys() instead of the combination of sf_buf and uiomove().

This is not only shorter; it also eliminates unnecessary thread pinning on
architectures that implement a direct map.

MFC after: 3 weeks


188921 22-Feb-2009 alc

Simplify the unwiring and activation of pages.

MFC after: 1 week


188318 08-Feb-2009 kib

Look up the directory entry for the tmpfs node being deleted by
both node pointer and name component. This does the right thing for
hardlinks to the same node in the same directory.

Submitted by: Yoshihiro Ota <ota j email ne jp>
PR: kern/131356
MFC after: 2 weeks


187959 31-Jan-2009 bz

Remove unused local variables.

Submitted by: Christoph Mallon christoph.mallon@gmx.de
Reviewed by: kib
MFC after: 2 weeks


184413 28-Oct-2008 trasz

Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which are 16 bits wide.

Approved by: rwatson (mentor)
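
The motivation for accmode_t in r184413 can be illustrated with a sketch of what happens when an access mask passes through a 16-bit variable. The typedefs and flag values below are hypothetical stand-ins chosen for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t sketch_mode_t;		/* 16-bit, as the log says of mode_t */
typedef int sketch_accmode_t;		/* wider type, hypothetical stand-in */

#define SKETCH_VREAD	0x0100		/* illustrative value only */
#define SKETCH_VNEWBIT	0x10000		/* hypothetical flag above bit 15 */

/* Round-trip an access mask through the 16-bit type; high bits are lost. */
static sketch_accmode_t
sketch_through_mode_t(sketch_accmode_t accmode)
{
	sketch_mode_t narrow;

	assert(accmode >= 0);
	narrow = (sketch_mode_t)accmode;	/* truncates to 16 bits */
	return ((sketch_accmode_t)narrow);
}
```

Any new constant that does not fit in 16 bits would silently vanish on such an assignment, which is why a dedicated wider type is needed before more V* constants can be added.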


183299 23-Sep-2008 obrien

The kernel-implemented 'memcmp' is an alias for 'bcmp'. However, memcmp
and bcmp are not the same thing. 'man bcmp' states that the return is
"non-zero" if the two byte strings are not identical, whereas
'man memcmp' states that the return is the "difference between the
first two differing bytes (treated as unsigned char values)" if the
two byte strings are not identical.

So provide a proper memcmp(9), but it is a C implementation, not a tuned
assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).
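
The difference between the two contracts quoted in r183299 can be sketched in portable C. The sketch_* functions below are illustrative only and are not the kernel's memcmp(9) or bcmp(9) implementations.

```c
#include <assert.h>
#include <stddef.h>

/* memcmp-style: the sign of the result orders the two byte strings. */
static int
sketch_memcmp(const void *b1, const void *b2, size_t len)
{
	const unsigned char *p1 = b1, *p2 = b2;
	size_t i;

	assert(b1 != NULL && b2 != NULL);
	for (i = 0; i < len; i++) {
		if (p1[i] != p2[i])
			return ((int)p1[i] - (int)p2[i]);
	}
	return (0);
}

/* bcmp-style: only "identical" (zero) versus "different" (non-zero). */
static int
sketch_bcmp(const void *b1, const void *b2, size_t len)
{
	return (sketch_memcmp(b1, b2, len) != 0);
}
```

Code that sorts or otherwise relies on the sign of the result needs the memcmp-style contract; code that only tests for equality can use either.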


183215 20-Sep-2008 kib

fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs
initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(),
vattr_null() or by zeroing it. Remove these to allow preinitialization
of fields work in vn_stat(). This is needed to get birthtime initialized
correctly.

Submitted by: Jaakko Heinonen <jh saunalahti fi>
Discussed on: freebsd-fs
MFC after: 1 month


183214 20-Sep-2008 kib

Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR().
NODEV is more appropriate when va_rdev doesn't have a meaningful value.

Submitted by: Jaakko Heinonen <jh saunalahti fi>
Suggested by: bde
Discussed on: freebsd-fs
MFC after: 1 month


183212 20-Sep-2008 kib

Initialize va_flags and va_filerev properly in VOP_GETATTR(). Don't
initialize va_vaflags and va_spare because they are not part of the
VOP_GETATTR() API. Also don't initialize birthtime to ctime or zero.

Submitted by: Jaakko Heinonen <jh saunalahti fi>
Reviewed by: bde
Discussed on: freebsd-fs
MFC after: 1 month


182739 03-Sep-2008 delphij

Reflect license change of NetBSD code.

Obtained from: NetBSD
MFC after: 3 days


182371 28-Aug-2008 attilio

Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally useless.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


179808 15-Jun-2008 kib

Do not redo the vnode tear-down work already done by insmntque() when
vnode cannot be put on the vnode list for mount.

Reported and tested by: marck
Guilty party: me
MFC after: 3 days


178243 16-Apr-2008 kib

Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by: kris
Tested by: kris, pho
Discussed with: jeff, dfr
MFC after: 2 weeks


177633 26-Mar-2008 dfr

Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
client handle safely with replies being de-multiplexed at the socket
upcall (typically driven directly by the NIC interrupt) and handed
off to whichever thread matches the reply. For UDP sockets, many RPC
clients can share the same socket. This allows the use of a single
privileged UDP port number to talk to an arbitrary number of remote
hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
server would be relatively straightforward and would follow
approximately the Solaris KPI. A single thread should be sufficient
for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
callbacks. I've tested the NLM server reasonably extensively - it
passes both my own tests and the NFS Connectathon locking tests
running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
support for the local NFS client's locking needs, it does have to
field async replies and granted callbacks from remote NLMs that the
local client has contacted. We relay these replies to the userland
rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
it will detect deadlocks caused by a lock request that covers more
than one blocking request. As required by the NLM protocol, all
deadlock detection happens synchronously - a user is guaranteed that
if a lock request isn't rejected immediately, the lock will
eventually be granted. The old system allowed for a 'deferred
deadlock' condition where a blocked lock request could wake up and
find that some other deadlock-causing lock owner had beaten them to
the lock.

* Since both local and remote locks are managed by the same kernel
locking code, local and remote processes can safely use file locks
for mutual exclusion. Local processes have no fairness advantage
compared to remote processes when contending to lock a region that
has just been unlocked - the local lock manager enforces a strict
first-come first-served model for both local and remote lockers.

Sponsored by: Isilon Systems
PR: 95247 107555 115524 116679
MFC after: 2 weeks


176559 25-Feb-2008 attilio

Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is
always curthread.

As KPI gets broken by this patch, manpages and __FreeBSD_version will be
updated by further commits.

Tested by: Andrea Barberio <insomniac at slackware dot it>


175294 13-Jan-2008 attilio

VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjunction with 'thread' argument passing which is always curthread.
Remove the useless extra argument and pass curthread explicitly to lower
layer functions, when necessary.

The KPI is broken by this change, which should affect several ports, so
the version bump and manpage update will be committed separately.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>


175202 10-Jan-2008 attilio

vn_lock() is currently only used with 'curthread' passed as the argument.
Remove this argument and pass curthread directly to the underlying
VOP_LOCK1() VFS method. This change makes the code cleaner and, in
particular, removes an annoying dependency, helping the next lockmgr() cleanup.
The KPI is, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, it is worth saying that the next commits will address
a similar cleanup of the VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>


174384 07-Dec-2007 delphij

Turn MPASS(0) into a panic with a more obvious reason why the assertion
has failed.


174379 06-Dec-2007 delphij

size_max should be unsigned, as such, use size_t here.


174265 04-Dec-2007 wkoszek

Explicitly initialize 'error' to 0 (two places). This lets one build tmpfs
from the latest source tree with an older compiler--gcc3.

Reviewed by: kib@ (on freebsd-current@)
Approved by: cognet@ (mentor)


173725 18-Nov-2007 delphij

MFp4: Several fixes to tmpfs which make it survive pho@'s
stress2 suite; to quote his letter, this change:

1. It removes the tn_lookup_dirent stuff. I think this cannot be fixed,
because nothing protects vnode/tmpfs node between lookup is done, and
actual operation is performed, in the case the vnode lock is dropped.
At least, this is the case with the from vnode for rename.

For now, we do the linear lookup in the parent node. This has its own
drawbacks. Not mentioning speed (that could be fixed by using hash), the
real problem is the situation where several hardlinks exist in the dvp.
But, I think this is fixable.

2. The patch restores the VV_ROOT flag on the root vnode after it became
reclaimed and allocated again. This fixes MPASS assertion at the start
of the tmpfs_lookup() reported by many.

Submitted by: kib


173724 18-Nov-2007 delphij

MFp4: Fix several style(9) bugs.

Submitted by: des


173570 12-Nov-2007 delphij

Correct a stack overflow which will trigger panics when
mode= is specified, caused by incorrect format string
specified to vfs_scanopt() and subsequently vsscanf().

Pointed out by: kib
Submitted by: des


172442 04-Oct-2007 delphij

MFp4: Provide a dummy verb "export" to shut up the message
shown at start when NFS is enabled.

Reported by: rafan
Approved by: re (tmpfs blanket)


172441 04-Oct-2007 delphij

Additional work is still needed before we can claim that tmpfs
is stable enough for production usage. Warn user upon mount.

Approved by: re (tmpfs blanket)


171862 16-Aug-2007 delphij

MFp4: rework tmpfs_readdir() logic in terms of correctness.

Approved by: re (tmpfs blanket)
Tested with: fstest, fsx


171802 10-Aug-2007 delphij

MFp4:
- LK_RETRY prevents vget() and vn_lock() from returning an error.
Remove associated code. [1]
- Properly use vhold() and vdrop() instead of their unlocked
versions, we are guaranteed to have the vnode's interlock
unheld. [1]
- Fix a pseudo-infinite loop caused by 64/32-bit arithmetic
in the same way as modern NetBSD versions. [2]
- Reorganize tmpfs_readdir to reduce duplicated code.

Submitted by: kib [1]
Obtained from: NetBSD [2]
Approved by: re (tmpfs blanket)


171799 10-Aug-2007 delphij

MFp4:

- Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1]
- Properly lock around tn_vnode to avoid NULL dereference
- Be more careful handling vnodes (*)

(*) This is a WIP
[1] by pjd via howardsu

Thanks kib@ for his valuable VFS related comments.

Tested with: fsx, fstest, tmpfs regression test set
Found by: pho's stress2 suite
Approved by: re (tmpfs blanket)


171704 03-Aug-2007 delphij

MFp4 - Refine locking to eliminate some potential race/panics:

- Copy before testing a pointer. This closes a race window.
- Use msleep with the node interlock instead of tsleep.
- Do proper locking around access to tn_vpstate.
- Assert vnode VOP lock for dir_{atta,de}tach to capture
inconsistent locking.

Suggested by: kib
Submitted by: delphij
Reviewed by: Howard Su
Approved by: re (tmpfs blanket)


171570 24-Jul-2007 delphij

MFp4: Force 64-bit arithmetic when calculating the maximum file size.
This fixes tmpfs calculations on 32-bit systems equipped with more than
4GB swap.

Reported by: Craig Boston <craig xfoil gank org>
PR: kern/114870
Approved by: re (tmpfs blanket)
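
The kind of problem fixed in r171570 can be sketched as follows. The function names are hypothetical, and the example assumes a platform where unsigned int is 32 bits wide, so that the product of two 32-bit values is computed in 32 bits unless one operand is widened first.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply in 32 bits: the product wraps before it is widened. */
static uint64_t
sketch_bytes_32(uint32_t pages, uint32_t page_size)
{
	assert(page_size > 0);
	return (pages * page_size);
}

/* Widen one operand first: the multiplication is done in 64 bits. */
static uint64_t
sketch_bytes_64(uint32_t pages, uint32_t page_size)
{
	assert(page_size > 0);
	return ((uint64_t)pages * page_size);
}
```

With 2097152 pages of 4096 bytes (8 GB), the first form wraps modulo 2^32 while the second produces the full byte count.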


171550 23-Jul-2007 delphij

MFp4: When swapping is not enabled, allow creating files by taking
physical memory pages into account for tm_maxfilesize.

Reported by: Dominique Goncalves <dominique.goncalves gmail.com>
Submitted by: Howard Su
Approved by: re (tmpfs blanket)


171489 19-Jul-2007 delphij

MFp4: Rework tmpfs's mapped read/write procedures. This
should finally fix the fsx test case.

The printf's added here will eventually be turned into
assertions.

Submitted by: Mingyan Guo (mostly)
Approved by: re (tmpfs blanket)


171362 11-Jul-2007 delphij

MFp4: Make use of the kernel unit number allocation facility
for tmpfs nodes.

Submitted by: Mingyan Guo <guomingyan gmail com>
Approved by: re (tmpfs blanket)


171308 08-Jul-2007 delphij

MFp4:
- Plug memory leak.
- Respect underlying vnode's properties rather than assuming that
the user wants root:wheel + 0755. Useful for using tmpfs(5) for
/tmp.
- Use roundup2 and howmany macros instead of rolling our own version.
- Try to fix fsx -W -R foo case.
- Instead of blindly zeroing a page, determine whether we need a pagein
in order to prevent data corruption.
- Fix several bugs reported by Coverity.

Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij
Coverity ID: CID 2550, 2551, 2552, 2557
Approved by: re (tmpfs blanket)


171087 29-Jun-2007 delphij

MFp4:

- Remove unnecessary NULL checks after M_WAITOK allocations.
- Use VOP_ACCESS instead of hand-rolled suser_cred()
calls. [1]
- Use malloc(9) KPI to allocate memory for string. The
optimization taken from NetBSD is not valid for FreeBSD
because our malloc(9) already acts that way. [2]

Requested by: rwatson [1]
Submitted by: Howard Su [2]
Approved by: re (tmpfs blanket)


171070 28-Jun-2007 delphij

Space/style cleanups after last set of commits.

Approved by: re (tmpfs blanket)


171069 28-Jun-2007 delphij

Staticify most of fifo/vn operations, they should not
be directly exposed outside.

Approved by: re (tmpfs blanket)


171068 28-Jun-2007 delphij

Use vfs_timestamp instead of nanotime when obtaining
a timestamp for use with timekeeping.

Approved by: re (tmpfs blanket)


171067 28-Jun-2007 delphij

Reorder tf_gen and tf_id in struct tmpfs_fid. This
saves 8 bytes on amd64 architecture.

Obtained from: NetBSD
Approved by: re (tmpfs blanket)


171040 26-Jun-2007 delphij

Remove two function prototypes that are no longer used.

Approved by: re (tmpfs blanket)


171038 26-Jun-2007 delphij

- Sync with NetBSD's RCSID (HEAD preferred).
- Correct a typo.

Approved by: re (tmpfs blanket)


171029 25-Jun-2007 delphij

MFp4: Several clean-ups and improvements over tmpfs:

- Remove tmpfs_zone_xxx KPI, the uma(9) wrapper, since
it does not bring any value now.
- Use |= instead of = when applying VV_ROOT flag.
- Remove tm_avariable_nodes list. Use uma to hold the
released nodes.
- init/destroy interlock mutex of node when init/fini
instead of ctor/dtor.
- Change memory computing using u_int to fix negative
value in 2G mem machine.
- Remove unnecessary bzero's
- Rely on uma logic to make file id allocation harder to
guess.
- Fix some unsigned/signed related things. Make sure
we respect -o size=xxxx
- Use wire instead of hold a page.
- Pass allocate_zero to obtain zeroed pages upon first
use.

Submitted by: Howard Su
Approved by: re (tmpfs blanket, kensmith)


170922 18-Jun-2007 delphij

Use vfs_timestamp() instead of nanotime() - leave it up to
the user to decide how detailed they want
timestamps to be.


170903 18-Jun-2007 delphij

MFp4: fix two locking problems:

- Hold TMPFS_LOCK while updating tm_pages_used.
- Hold vm page while doing uiomove.

This will hopefully fix all known panics.

Submitted by: Howard Su


170808 16-Jun-2007 delphij

MFp4: Add tmpfs, an efficient memory file system.

Please note that this is currently considered an
experimental feature, so there could be some rough
edges. Consult http://wiki.freebsd.org/TMPFS for
more information.

For now, connect tmpfs to build on i386 and amd64
architectures only. Please let us know if you have
success with other platforms.

This work was developed by Julio M. Merino Vidal
for NetBSD as a SoC project; Rohit Jalan ported it
from NetBSD to FreeBSD. Howard Su and Glen Leeder
have worked on it to continue this effort.

Obtained from: NetBSD via p4
Submitted by: Howard Su (with some minor changes)
Approved by: re (kensmith)