History log of /freebsd-10.1-release/sys/fs/ext2fs/inode.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 262723 04-Mar-2014 pfg

MFC r262623, r262667:

ext2fs: use of tab vs spaces.

Consistently use a single tab after a #define as mentioned in style(9).
Use tabs instead of space for indenting.
Fix a typo: "hash_vesion".

No functional change.


# 261311 31-Jan-2014 pfg

MFC r260988, r261034, r261120, r261235:

ext2fs: Properly the EXT4_EXTENTS and EXT4_INDEX to the inode flags.

In order to support Ext4 extents we need to pass the Ext4 inode flags
without interfering with the chflags. This is better done by using the
i_flag field in the inode and doing proper translation to the linux
ext4 equivalents.

Solve a potential corruption issue in the dirindex code. The dirindex
code can now be renabled as the problems related to it have been
solved.

Suggested by: bde
Tested by: kevlo


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 254283 13-Aug-2013 pfg

Define ext2fs local types and use them.

Add definitions for e2fs_daddr_t, e4fs_daddr_t in addition
to the already existing e2fs_lbn_t and adjust them for ext4.
Other than making the code more readable these changes should
fix problems related to big filesystems.

Setting the proper types can be tricky so the process was
helped by looking at UFS. In our implementation, logical block
numbers can be negative and the code depends on it. In ext2,
block numbers are unsigned so it is convenient to keep
e2fs_daddr_t unsigned and use the complete 32 bits. In the
case of e4fs_daddr_t, while the value should be unsigned, for
ext4 we only need to support 48 bits so preserving an extra
bit from the sign is not an issue.

While here also drop the ext2_setblock() prototype that was
never used.

Discussed with: mckusick, bde
MFC after: 3 weeks


# 254260 12-Aug-2013 pfg

Add read-only support for extents in ext2fs.

Basic support for extents was implemented by Zheng Liu as part
of his Google Summer of Code in 2010. This support is read-only
at this time.

In addition to extents we also support the huge_file extension
for read-only purposes. This works nicely with the additional
support for birthtime/nanosec timestamps and dir_index that
have been added lately.

The implementation may not work for all ext4 filesystems as
it doesn't support some features that are being enabled by
default on recent linux like flex_bg. Nevertheless, the feature
should be very useful for migration or simple access in
filesystems that have been converted from ext2/3 or don't use
incompatible features.

Special thanks to Zheng Liu for his dedication and continued
work to support ext2 in FreeBSD.

Submitted by: Zheng Liu (lz@)
Reviewed by: Mike Ma, Christoph Mallon (previous version)
Sponsored by: Google Inc.
MFC after: 3 weeks


# 252103 23-Jun-2013 pfg

Define and use e2fs_lbn_t in ext2fs.

In line to what is done in UFS, define an internal type
e2fs_lbn_t for the logical block numbers.

This change is basically a no-op as the new type is unchanged
(int32_t) but it may be useful as bumping this may be required
for ext4fs.

Also, as pointed out by Bruce Evans:

-Use daddr_t for daddr in ext2_bmaparray(). This seems to
improve reliability with the reallocblks option.
- Add a cast to the fsbtodb() macro as in UFS.

Reviewed by: bde
MFC after: 3 days


# 251809 16-Jun-2013 pfg

Re-sort ext2fs headers to make things easier to find.

In the ext2fs driver we have a mixture of headers:

- The ext2_ prefixed headers have strong influence from NetBSD
and are carry specific ext2/3/4 information.
- The unprefixed headers are inspired on UFS and carry implementation
specific information.

Do some small adjustments so that the information is easier to
find coming from either UFS or the NetBSD implementation.

MFC after: 3 days


# 246562 08-Feb-2013 pfg

Remove unused MAXSYMLINKLEN macro.

Reviewed by: mckusick
PR: kern/175794
MFC after: 1 week


# 245820 22-Jan-2013 pfg

ext2fs: make some inode fields match the ext2 spec.

Ext2fs uses unsigned fields in its dinode struct.
FreeBSD can have negative values in some of those
fields and the inode is meant to interact with the
system so we have never respected the unsigned
nature of most of those fields.

Block numbers and the NFS generation number do
not need to be signed so redefine them as
unsigned to better match the on-disk information.

MFC after: 1 week


# 239359 17-Aug-2012 mjg

Remove unused member of struct indir (in_exists) from UFS and EXT2 code.

Reviewed by: mckusick
Approved by: trasz (mentor)
MFC after: 1 week


# 232703 08-Mar-2012 pfg

Add support for ns timestamps and birthtime to the ext2/3 driver.

When using big inodes there is sufficient space in ext3 to
keep extra resolution and birthtime (creation) timestamps.
The appropriate fields in the on-disk inode have been approved
for a long time but support for this in ext3 has not been
widely distributed.

In preparation for ext4 most linux distributions have enabled
by default such bigger inodes and some people use nanosecond
timestamps in ext3. We now support those when the inode is big
enough and while we do recognize the EXT4F_ROCOMPAT_EXTRA_ISIZE,
we maintain the extra timestamps even when they are not used.

An additional note by Bruce Evans:
We blindly accept unrepresentable tv_nsec in VOP_SETATTR(), but
all file systems have always done that. When POSIX gets around
to specifying the behaviour, it will probably require certain
rounding to the fs's resolution and not rejecting the request.
This unfortunately means that syscalls that set times can't
really tell if they succeeded without reading back the times
using stat() or similar and checking that they were set close
enough.

Reviewed by: bde
Approved by: jhb (mentor)
MFC after: 2 weeks


# 231168 07-Feb-2012 pfg

Update the data structures with some fields reserved for
ext4 but that can be used in ext3 mode.

Also adjust the internal inode to carry the birthtime,
like in UFS, which is starting to get some use when
big inodes are available.

Right now these are just placeholders for features
to come.

Approved by: jhb (mentor)
MFC after: 2 weeks


# 231075 06-Feb-2012 kib

Current implementations of sync(2) and syncer vnode fsync() VOP uses
mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which
is needed to guarantee a synchronous completion of the initiated i/o
before syscall or VOP return. Global removal of MNTK_ASYNC option is
harmful because not only i/o started from corresponding thread becomes
synchronous, but all i/o is synchronous on the filesystem which is
initiated during sync(2) or syncer activity.

Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local
thread flag to disable async i/o for current thread only. Use the
opportunity to move DOINGASYNC() macro into sys/vnode.h and
consistently use it through places which tested for MNTK_ASYNC.

Some testing demonstrated 60-70% improvements in run time for the
metadata-intensive operations on async-mounted UFS volumes, but still
with great deviation due to other reasons.

Reviewed by: mckusick
Tested by: scottl
MFC after: 2 weeks


# 221166 28-Apr-2011 jhb

Sync with several changes in UFS/FFS:
- 77115: Implement support for O_DIRECT.
- 98425: Fix a performance issue introduced in 70131 that was causing
reads before writes even when writing full blocks.
- 98658: Rename the BALLOC flags from B_* to BA_* to avoid confusion with
the struct buf B_ flags.
- 100344: Merge the BA_ and IO_ flags so so that they may both be used in
the same flags word. This merger is possible by assigning the IO_ flags
to the low sixteen bits and the BA_ flags the high sixteen bits.
- 105422: Fix a file-rewrite performance case.
- 129545: Implement IO_INVAL in VOP_WRITE() by marking the buffer as
"no cache".
- Readd the DOINGASYNC() macro and use it to control asynchronous writes.
Change i-node updates to honor DOINGASYNC() instead of always being
synchronous.
- Use a PRIV_VFS_RETAINSUGID check instead of checking cr_uid against 0
directly when deciding whether or not to clear suid and sgid bits.

Submitted by: Pedro F. Giffuni giffunip at yahoo


# 221128 27-Apr-2011 jhb

Use a private EXT2_ROOTINO constant instead of redefining ROOTINO.

Submitted by: Pedro F. Giffuni giffunip at yahoo


# 221126 27-Apr-2011 jhb

Various style fixes including using uint*_t instead of u_int*_t.

Submitted by: Pedro F. Giffuni giffunip at yahoo


# 217582 19-Jan-2011 jhb

Merge 118969 from UFS:
Eliminate the i_devvp field from the incore inodes, we can get the same
value from ip->i_ump->um_devvp.

Submitted by: Pedro F. Giffuni giffunip at yahoo
MFC after: 1 week


# 202283 14-Jan-2010 lulf

Bring in the ext2fs work done by Aditya Sarawgi during and after Google Summer
of Code 2009:

- BSDL block and inode allocation policies for ext2fs. This involves the use
FFS1 style block and inode allocation for ext2fs. Preallocation was removed
since it was GPL'd.
- Make ext2fs MPSAFE by introducing locks to per-mount datastructures.
- Fixes for kern/122047 PR.
- Various small bugfixes.
- Move out of gnu/ directory.

Sponsored by: Google Inc.
Submitted by: Aditya Sarawgi <sarawgi.aditya AT SPAMFREE gmail DOT com>


# 178243 16-Apr-2008 kib

Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
with the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by: kris
Tested by: kris, pho
Discussed with: jeff, dfr
MFC after: 2 weeks


# 143686 16-Mar-2005 phk

Remove inode fields previously used for private inode hash tables.


# 143677 16-Mar-2005 phk

Don't hold a reference to the disk vnode for each inode.

Don't store the disk cdev in all inodes, it's only used for debugging
printfs.


# 139778 06-Jan-2005 imp

/* -> /*- for copyright notices, minor format tweaks as necessary


# 130763 20-Jun-2004 bde

Fixed misformatting of code and breaking of a comment in previous commit.


# 130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# 128019 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


# 96877 18-May-2002 iedowse

Use explicitly-sized types where necessary to make ext2fs work again
after the change to a 64-bit daddr_t.


# 96596 14-May-2002 iedowse

Following a repo-copy from src/sys/ufs/ufs, rename functions and
structures etc. to ext2fs-specific names, and remove ufs-specific
code that is no longer required. As a first stage, the code will
still convert back and forth between the on-disk format and struct
inode, so the struct dinode fields have been added to struct inode
for now.

Note that these files are not yet connected to the build.


# 83899 24-Sep-2001 iedowse

The addition of i_dirhash to struct inode pushed RELENG_4's
sizeof(struct inode) into a new malloc bucket on the i386. This
didn't happen in -current due to the removal of i_lock, but it does
no harm to apply the workaround to -current first.

Reduce the size of the i_spare[] array in struct inode from 4 to
3 entries, and change ext2fs to use i_din.di_spare[1] so that it
does not need i_spare[3].

Reviewed by: bde
MFC after: 3 days


# 79561 10-Jul-2001 iedowse

Bring in dirhash, a simple hash-based lookup optimisation for large
directories. When enabled via "options UFS_DIRHASH", in-core hash
arrays are maintained for large directories. These allow all
directory operations to take place quickly instead of requiring
long linear searches. For now anyway, dirhash is not enabled by
default.

The in-core hash arrays have a memory requirement that is approximately
half the size of the size of the on-disk directory file. A number
of new sysctl variables allow control over which directories get
hashed and over the maximum amount of memory that dirhash will use:

vfs.ufs.dirhash_minsize
The minimum on-disk directory size for which hashing should be
used. The default is 2560 (2.5k).

vfs.ufs.dirhash_maxmem
The system-wide maximum total memory to be used by dirhash data
structures. The default is 2097152 (2MB).

The current amount of memory being used by dirhash is visible
through the read-only sysctl variable vfs.ufs.dirhash_maxmem.
Finally, some extra sanity checks that are enabled by default, but
which may have an impact on performance, can be disabled by setting
vfs.ufs.dirhash_docheck to 0.

Discussed on: -fs, -hackers


# 76357 08-May-2001 mckusick

When running with soft updates, track the number of blocks and files
that are committed to being freed and reflect these blocks in the
counts returned by statfs (and thus also by the `df' command). This
change allows programs such as those that do news expiration to
know when to stop if they are trying to create a certain percentage
of free space. Note that this change does not solve the much harder
problem of making this to-be-freed space available to applications
that want it (thus on a nearly full filesystem, you may still
encounter out-of-space conditions even though the free space will
show up eventually). Hopefully this harder problem will be the
subject of a future enhancement.


# 73942 07-Mar-2001 mckusick

Fixes to track snapshot copy-on-write checking in the specinfo
structure rather than assuming that the device vnode would reside
in the FFS filesystem (which is obviously a broken assumption with
the device filesystem).


# 66355 25-Sep-2000 bp

Add a lock structure to vnode structure. Previously it was either allocated
separately (nfs, cd9660 etc) or keept as a first element of structure
referenced by v_data pointer(ffs). Such organization leads to known problems
with stacked filesystems.

From this point vop_no*lock*() functions maintain only interlock lock.
vop_std*lock*() functions maintain built-in v_lock structure using lockmgr().
vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared
lock on vnode.

If filesystem wishes to export lockmgr compatible lock, it can put an address
of this lock to v_vnlock field. This indicates that the upper filesystem
can take advantage of it and use single lock structure for entire (or part)
of stack of vnodes. This field shouldn't be examined or modified by VFS code
except for initialization purposes.

Reviewed in general by: mckusick


# 62976 11-Jul-2000 mckusick

Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).


# 60938 26-May-2000 jake

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 60833 23-May-2000 jake

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 55206 29-Dec-1999 peter

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 48801 13-Jul-1999 mckusick

Create the macro DOINGASYNC to check whether the MNT_ASYNC flag has
been set for a mount point. Insert missing checks to ensure that all
write operations are done asynchronously when the MNT_ASYNC option
has been requested.

Submitted by: Craig A Soules <soules+@andrew.cmu.edu>
Reviewed by: Kirk McKusick <mckusick@mckusick.com>


# 44512 06-Mar-1999 bde

Don't depend on <ufs/ufs/quota.h> or another (old) prerequisite including
<sys/queue.h>. This fixes my recent breakage of biosboot by unpolluting
<ufs/ufs/quota.h> in the !KERNEL case.


# 40304 13-Oct-1998 bde

Fixed bloatage of `struct inode'. We used 5 "spare" fields for ext2fs,
but when i_effnlink was added to support soft updates, there was only
room for 4 spares. The number of spares was not reduced, so the inode
size became 260 (on i386's), or 512 after rounding up by malloc().
Use one spare field in `struct dinode' instead of the 5th spare field
in the inode and reduced to 4 spares in the inode so that the size is
256 again.

Changed the types of the spares in the inode from int to u_int32_t
so that the inode size has more chance of being <= 256 under other
arches, and downdated ext2fs to match (it was broken to use ints
before rev.1.1).


# 37363 03-Jul-1998 bde

Sync timestamp changes for inodes of special files to disk as late
as possible (when the inode is reclaimed). Temporarily only do
this if option UFS_LAZYMOD configured and softupdates aren't enabled.
UFS_LAZYMOD is intentionally left out of /sys/conf/options.

This is mainly to avoid almost useless disk i/o on battery powered
machines. It's silly to write to disk (on the next sync or when the
inode becomes inactive) just because someone hit a key or something
wrote to the screen or /dev/null.

PR: 5577
Previous version reviewed by: phk


# 34901 26-Mar-1998 phk

Add two new functions, get{micro|nano}time.

They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some puntuation and grammer patches from Bruce.

A couple of XXX comments.


# 34266 08-Mar-1998 julian

Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)
Submitted by: Kirk McKusick (mcKusick@mckusick.com)
Obtained from: WHistle development tree


# 32889 30-Jan-1998 phk

Retire LFS.

If you want to play with it, you can find the final version of the
code in the repository the tag LFS_RETIREMENT.

If somebody makes LFS work again, adding it back is certainly
desireable, but as it is now nobody seems to care much about it,
and it has suffered considerable bitrot since its somewhat haphazard
integration.

R.I.P


# 31557 05-Dec-1997 jkh

Needs to include <sys/lock.h> if we're using struct lock.


# 30513 17-Oct-1997 phk

Make a set of VOP standard lock, unlock & islocked VOP operators, which
depend on the lock being located at vp->v_data. Saves 3x3 identical
vop procs, more as the other filesystems becomes lock aware.


# 30418 14-Oct-1997 phk

I think my previous change may have opened a race conditio.
This patch does the same thing, with no change in semantics.


# 27375 13-Jul-1997 bde

Fixed comment about i_spare.


# 26001 22-May-1997 phk

Shrink struct inode by 20 bytes, so that malloc wastes less space.

Pointed out by: bde


# 24477 01-Apr-1997 bde

Removed nested include of <ufs/ufs/dir.h>. Use the pre-Lite2 hack of
defining doff_t both here and in <ufs/ufs/dir.h> so that this file
is independent of <ufs/ufs/dir.h>. It still has old prerequisites
<sys/param.h> and <ufs/ufs/quota.h>, and a new Lite2 prerequisite of
<sys/lock.h>, sigh.

This might fix lsof, which was broken by namespace pollution giving
conflicting definitions of DIRBLKSIZ.


# 23347 03-Mar-1997 bde

Removed unused flag IN_RECURSE and unused struct member i_lockcount.


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 18397 19-Sep-1996 nate

In sys/time.h, struct timespec is defined as:

/*
* Structure defined by POSIX.4 to be like a timeval.
*/
struct timespec {
time_t ts_sec; /* seconds */
long ts_nsec; /* and nanoseconds */
};

The correct names of the fields are tv_sec and tv_nsec.

Reminded by: James Drobina <jdrobina@infinet.com>


# 13765 30-Jan-1996 mpp

Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.


# 12117 05-Nov-1995 dyson

Changes to existing files for ext2fs support. The UFS mods need rework
in the future as they are a bit crufty -- but at least the stuff is in the
tree now.


# 8876 30-May-1995 rgrimes

Remove trailing whitespace.


# 8041 24-Apr-1995 dyson

Changes to get rid of ufslk2 hangs when doing read/write to/from
mmap regions that are in the same file as the read/write.


# 5247 27-Dec-1994 bde

Use the same current time throughout ITIMES(). I want all current
timestamps for an atomic operation such as rename() on a local file
system to be identical.

Uniformize yet another idempotency ifdef. The comment nesting was
bogus.


# 2177 21-Aug-1994 paul

Made idempotent
Reviewed by:
Submitted by:


# 1817 02-Aug-1994 dg

Added $Id$


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources