272461 |
03-Oct-2014 |
gjb |
Copy stable/10@r272459 to releng/10.1 as part of the 10.1-RELEASE process.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
272116 |
25-Sep-2014 |
trasz |
MFC r272025:
Fix thinko that, with two map entries like shown below, in that order, made autofs mix them up: the second one wasn't visible in ls(1) output, and trying to access it would trigger mount for the first one.
foobar host:/foobar
foo host:/foo
Approved by: re (gjb) Sponsored by: The FreeBSD Foundation
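The log describes only the symptom, not the faulty code. One plausible mechanism, sketched here as an assumption, is a name comparison that checks only the first bytes of the looked-up key, so "foo" also matches the earlier entry "foobar". All names in this sketch are hypothetical and are not taken from the autofs sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical buggy comparison: only the first keylen bytes are checked,
 * so the key "foo" also matches the entry "foobar".
 */
static bool
name_matches_prefix(const char *entry, const char *key, size_t keylen)
{
	return (strncmp(entry, key, keylen) == 0);
}

/* Exact comparison: the lengths have to agree as well. */
static bool
name_matches_exact(const char *entry, const char *key, size_t keylen)
{
	return (strlen(entry) == keylen && strncmp(entry, key, keylen) == 0);
}

/* Return the index of the first entry that matches key, or -1 if none does. */
static int
find_entry(const char *const *entries, size_t n, const char *key,
    bool (*match)(const char *, const char *, size_t))
{
	size_t keylen = strlen(key);

	for (size_t i = 0; i < n; i++)
		if (match(entries[i], key, keylen))
			return ((int)i);
	return (-1);
}
```

With the entries in the order "foobar", "foo", the prefix comparison resolves "foo" to the first entry, which matches the reported behaviour of triggering the wrong mount.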
|
270904 |
31-Aug-2014 |
trasz |
MFC r270507:
Fix bug that, assuming a/ is a root of NFS filesystem mounted on autofs, prevented "mv a/from a/to" from working, while "cd a && mv from to" was ok.
PR: 192948 Sponsored by: The FreeBSD Foundation
|
270900 |
31-Aug-2014 |
trasz |
MFC r270402:
Autofs softc needs to be global anyway, so don't pass it as a local variable, and don't store in autofs_mount. Also rename it from 'sc' to 'autofs_softc', since it's global and extern.
Sponsored by: The FreeBSD Foundation
|
270899 |
31-Aug-2014 |
trasz |
MFC r270399:
Add comment explaining one of the quirks in autofs.
Sponsored by: The FreeBSD Foundation
|
270898 |
31-Aug-2014 |
trasz |
MFC r270281:
Fix includes.
Suggested by: pluknet@ Sponsored by: The FreeBSD Foundation
|
270897 |
31-Aug-2014 |
trasz |
MFC r270276:
Use __FBSDID() properly.
Suggested by: pluknet@ Sponsored by: The FreeBSD Foundation
|
270894 |
31-Aug-2014 |
trasz |
MFC r270207:
Rework ".." lookup; previous one failed to properly busy the mountpoint.
Reviewed by: kib@ Sponsored by: The FreeBSD Foundation
|
270892 |
31-Aug-2014 |
trasz |
MFC r270096:
Bring in the new automounter, similar to what's provided in most other UNIX systems, e.g. MacOS X and Solaris. It uses Sun-compatible map format, has proper kernel support, and LDAP integration.
There are still a few outstanding problems; they will be fixed shortly.
Reviewed by: allanjude@, emaste@, kib@, wblock@ (earlier versions) Phabric: D523 Relnotes: yes Sponsored by: The FreeBSD Foundation
|
270319 |
22-Aug-2014 |
kib |
MFC r269708: Unlock ldvp and lock dvp to compensate for possible ldvp unlock in lower VOP_LOOKUP() and dvp reclamation. Use cached value of dvp->v_mount.
|
270066 |
16-Aug-2014 |
rmacklem |
MFC: r269771 Change the NFS server's printf related to hitting the DRC cache's flood level so that it suggests increasing vfs.nfsd.tcphighwater.
|
269655 |
07-Aug-2014 |
kib |
MFC r269347: Do not generate 1000 unique lock names for nfsrc hash chain locks. Shorten the names of some nfs mutexes.
|
269493 |
04-Aug-2014 |
kib |
MFC r269187: Assert that nullfs vnode has VV_ROOT set whenever lower vnode has. Assert that dotdot lookup on the root vnode is not performed.
|
269452 |
03-Aug-2014 |
rmacklem |
MFC: r268273 The new NFSv3 server did not generate directory postop attributes for the reply to ReaddirPlus when the server failed within the loop that calls VFS_VGET(). This failure is most likely an error return from VFS_VGET() caused by a bogus d_fileno that was truncated to 32bits. This patch fixes the server so that it will return directory postop attributes for the failure. It does not fix the underlying issue caused by d_fileno being uint32_t when a file system like ZFS generates a fileno that is greater than 32bits.
|
269398 |
01-Aug-2014 |
rmacklem |
MFC: r268115 Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server.
|
269284 |
30-Jul-2014 |
kib |
MFC r268765: Remove unused header.
|
269283 |
30-Jul-2014 |
kib |
MFC r268764: Check for the cross-device cross-link attempt in the VFS, instead of VOP_LINK() implementations.
|
269176 |
28-Jul-2014 |
kib |
MFC r268766: Do not ignore error from tmpfs_alloc_vp().
|
269175 |
28-Jul-2014 |
kib |
MFC r268617: Rework the tmpfs unmount.
|
269174 |
28-Jul-2014 |
kib |
MFC r268615: Add OBJ_TMPFS_NODE flag.
MFC r268616: Set the OBJ_TMPFS_NODE flag for vm_object of VREG tmpfs node.
MFC r269053: Correct assertion. tmpfs vm object is always at the bottom of the shadow chain.
|
269173 |
28-Jul-2014 |
kib |
MFC r268614: Use tmpfs_vn_get_ino_gen() to handle the races with reclaim in tmpfs dotdot lookup.
|
269172 |
28-Jul-2014 |
kib |
MFC r268613: Style. Add comment about lock mode.
|
269170 |
28-Jul-2014 |
kib |
MFC r268611: Replace goto's with the return.
|
269169 |
28-Jul-2014 |
kib |
MFC r268610: Add convenience macro to assert tmpfs node lock.
|
269168 |
28-Jul-2014 |
kib |
MFC r268609: Add some assertions for the code handling vm_object for tmpfs vnode.
|
269167 |
28-Jul-2014 |
kib |
MFC r268608: The tmpfs_link() must not dereference the filesystem-specific data for a vnode until it is verified that the vnode indeed belongs to tmpfs mount.
|
269165 |
28-Jul-2014 |
kib |
MFC r268606: Generalize vn_get_ino() to allow filesystems to use custom vnode producer. Convert inline copies of vn_get_ino() in msdosfs and cd9660 into the uses of vn_get_ino_gen().
|
269164 |
28-Jul-2014 |
kib |
MFC r268605: Remove code separator lines which do not conform to style(9).
|
269156 |
27-Jul-2014 |
kib |
MFC r269081: Fix typo.
|
268961 |
21-Jul-2014 |
bdrewery |
MFC r268114:
Change NFS readdir() to only ignore cookies preceding the given offset for UFS rather than for all but ZFS.
|
268580 |
13-Jul-2014 |
rmacklem |
MFC: r268008 There might be a potential race condition for the NFSv4 client when a newly created file has another open done on it that updates the open mode. This patch moves the code that updates the open mode up into the block where the mutex is held to ensure this cannot happen. No bug caused by this potential race has been observed, but this fix is a safety belt to ensure it cannot happen.
|
268335 |
06-Jul-2014 |
mjg |
MFC r265206:
Ignore the error from pipespace_new when creating a pipe.
It can fail if the pipe map is exhausted (as a result of too many pipes created), but it is not fatal and could be provoked by unprivileged users. The only consequence is worse performance with the given pipe.
|
267816 |
24-Jun-2014 |
kib |
MFC r267564: In msdosfs_setattr(), add a check for result of the utimes(2) permissions test. Refactor the permission checks for utimes(2).
|
267346 |
11-Jun-2014 |
kib |
MFC r267060: Allow shared locking for the tmpfs vnode.
|
267343 |
10-Jun-2014 |
rmacklem |
MFC: r267191 The new NFS server would not allow a hard link to be created to a symlink. This restriction (which was inherited from OpenBSD) is not required by the NFS RFCs. Since this is allowed by the old NFS server, it is a POLA violation to not allow it. This patch modifies the new NFS server to allow this.
|
265807 |
10-May-2014 |
kib |
MFC r265275: Overwrite the de_Name for the directories on rename to correct the dot name.
|
265714 |
08-May-2014 |
rmacklem |
MFC: r265252 The new draft specification for NFSv4.0 specifies that a server should either accept owner and owner_group strings that are just the digits of the uid/gid or return NFS4ERR_BADOWNER. This patch adds a sysctl vfs.nfsd.enable_stringtouid, which can be set to enable the server w.r.t. accepting numeric string. It also ensures that NFS4ERR_BADOWNER is returned if numeric uid/gid strings are not enabled. This fixes the server for recent Linux nfs4 clients that use numeric uid/gid strings by default.
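A minimal sketch of the decision described above, assuming a boolean that mirrors the vfs.nfsd.enable_stringtouid setting. The function name and return codes are hypothetical; the real server also resolves name@domain strings, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OWN_OK		0
#define OWN_BADOWNER	1	/* stands in for NFS4ERR_BADOWNER */

/*
 * Convert an owner string made only of digits into a uid.
 * Non-numeric strings would go through a name lookup in a real server;
 * here they are simply rejected to keep the sketch short.
 */
static int
owner_to_uid(const char *s, size_t len, bool allow_numeric, uint32_t *uidp)
{
	uint64_t v = 0;

	if (len == 0)
		return (OWN_BADOWNER);
	for (size_t i = 0; i < len; i++) {
		if (s[i] < '0' || s[i] > '9')
			return (OWN_BADOWNER);
		v = v * 10 + (uint64_t)(s[i] - '0');
		if (v > UINT32_MAX)
			return (OWN_BADOWNER);	/* does not fit in a uid */
	}
	if (!allow_numeric)
		return (OWN_BADOWNER);	/* numeric strings are disabled */
	*uidp = (uint32_t)v;
	return (OWN_OK);
}
```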
|
265667 |
08-May-2014 |
rmacklem |
MFC: r264888 The PR reported that the old NFS server did not set uio_td == NULL for the VOP_READ() call. This patch fixes both the old and new server for this case.
|
265621 |
07-May-2014 |
rmacklem |
MFC: r264845 Remove an unnecessary level of indirection for an argument. This simplifies the code and should prevent the clang sparc port from generating an abort() call.
|
265620 |
07-May-2014 |
rmacklem |
MFC: r264842 Modify the NFSv4 client's Pathconf RPC (actually a Getattr Op.) so that it only does the RPC for names that are answered by the RPC. Doing the RPC for other names is harmless, but unnecessary.
|
265470 |
06-May-2014 |
rmacklem |
MFC: r264738 For an NFSv4 mount with the "nocto" option, don't get the up to date file attributes upon close. This reduces the Getattr RPC count by about 65% for software builds.
|
265469 |
06-May-2014 |
rmacklem |
MFC: r264705, r264749 Modify the NFSv4 client create/mkdir RPC so that it acquires post-create/mkdir directory attributes. This allows the RPC to name cache the newly created directory and reduces the lookup RPC count for applications creating a lot of directories.
|
265466 |
06-May-2014 |
rmacklem |
MFC: r264681 Modify the NFSv4 client open/create RPC so that it acquires post-open/create directory attributes. This allows the RPC to name cache the newly created file and reduces the lookup RPC count by about 10% for software builds.
|
265434 |
06-May-2014 |
rmacklem |
MFC: r264672 Modify the Lookup RPC for NFSv4 so that it acquires directory attributes. This allows the client to cache directory names when they are looked up, reducing the Lookup RPC count by about 40% for software builds.
|
265243 |
02-May-2014 |
ae |
MFC r264494: Use SMB_QUERY_FS_SIZE_INFO request to populate statfs structure. When server doesn't support this request, try to use SMB_INFO_ALLOCATION. And use SMB_COM_QUERY_INFORMATION_DISK request as fallback.
MFC r264600: Remove redundant unlock.
This code was removed from the opensolaris and darwin netsmb implementations; in DfBSD it has also been disabled.
|
264266 |
08-Apr-2014 |
delphij |
Fix NFS deadlock vulnerability. [SA-14:05]
Fix "Heartbleed" vulnerability and ECDSA Cache Side-channel Attack in OpenSSL. [SA-14:06]
|
263946 |
30-Mar-2014 |
bdrewery |
MFC r263131,r263174,r263175:
Tmpfs readdir() redundant logic and code readability cleanup.
r263131: Cleanup redundant logic and add some comments to help explain how it works in lieu of potentially less clear code.
r263174: Rename cnt to maxcookies and change its use as the condition for when to lookup cookies to be less obscure.
r263175: Add missing FALLTHROUGH comment in tmpfs_dir_getdents for looking up '.' and '..'.
|
263943 |
30-Mar-2014 |
bdrewery |
MFC r263130:
Fix -o size less than PAGE_SIZE resulting in SIZE_MAX being used.
|
263670 |
23-Mar-2014 |
pfg |
MFC: r263441:
msdosfs: minor format fix - spaces vs tab
|
262943 |
09-Mar-2014 |
pfg |
MFC r262869:
ext2fs: Fix a bug when sorting htree entries.
This is a typo introduced when bringing the original code from NetBSD.
Reported by: Mike Ma
|
262723 |
04-Mar-2014 |
pfg |
MFC r262623, r262667:
ext2fs: use of tab vs spaces.
Consistently use a single tab after a #define as mentioned in style(9). Use tabs instead of space for indenting. Fix a typo: "hash_vesion".
No functional change.
|
262563 |
27-Feb-2014 |
pfg |
MFC r262346: ext2fs: fully enable ext4 read-only support.
The ext4 developers tend to tag Ext4-specific flags as "incompatible" even when such features are not relevant for read-only support. This is a consequence of the process through which this filesystem is implemented without design and the fact that some new features are not extensible to ext2/3.
Organize the features according to what we support and sort them so that we can now read-only mount filesystems with some features that may be found in newly formatted ext4 fs.
Submitted by: Zheng Liu
|
262211 |
19-Feb-2014 |
dim |
MFC r261914:
In sys/fs/nandfs/nandfs_vfsops.c, #if 0 an unused static function.
|
261313 |
31-Jan-2014 |
pfg |
MFC r261136:
ext2fs: Re-enable reallocblk.
The major corruption issues affecting this code have been fixed.
Tested by: Mike Ma
|
261311 |
31-Jan-2014 |
pfg |
MFC r260988, r261034, r261120, r261235:
ext2fs: Properly pass the EXT4_EXTENTS and EXT4_INDEX flags to the inode flags.
In order to support Ext4 extents we need to pass the Ext4 inode flags without interfering with the chflags. This is better done by using the i_flag field in the inode and doing proper translation to the linux ext4 equivalents.
Solve a potential corruption issue in the dirindex code. The dirindex code can now be re-enabled as the problems related to it have been solved.
Suggested by: bde Tested by: kevlo
|
261055 |
22-Jan-2014 |
mav |
MFC r260229, r260258, r260367, r260390, r260459, r260648: Rework NFS Duplicate Request Cache cleanup logic.
- Introduce an additional hash to group requests by hash of sockref. This allows processing TCP acknowledgements without looping through the whole cache, and as a result allows doing it every time.
- Introduce additional callbacks to notify the application layer about socket disconnection. Without this, the last few requests processed just before socket disconnection never processed their ACKs and stayed stuck in the cache for many hours.
- Implement a transport-specific method for tracking reply acknowledgements. The new implementation does not cross multiple stack layers to get the data and does not have the race conditions that previously left some requests stuck in the cache. This could be done more efficiently at the sockbuf layer, but that would break some KBIs, and I don't know of other consumers for it aside from NFS.
- Instead of traversing the whole DRC twice per request, run cleaning only once per request, and except in some conditions traverse only a single hash slot at a time.
Together this limits NFS DRC growth only to situations of real connectivity problems. If network is working well, and so all replies are acknowledged, cache remains almost empty even after hours of heavy load. Without this change on the same test cache was growing to many thousand requests even with perfectly working local network.
As another result this reduces CPU time spent on the DRC handling during SPEC NFS benchmark from about 10% to 0.5%.
Sponsored by: iXsystems, Inc.
|
261051 |
22-Jan-2014 |
mav |
MFC r259877: Slightly simplify expiration logic introduced in r254337.
- Do not update the histogram for items we are any way deleting from cache.
- Do not update the histogram if nfsrc_tcphighwater is not set.
- Remove some extra math operations.
|
261049 |
22-Jan-2014 |
mav |
MFC r259765: Fix RPC server threads file handle affinity to work better with ZFS.
Instead of taking 8 specific bytes of the file handle to identify a file during RPC thread affinity handling, use a trivial hash of the full file handle. ZFS's struct zfid_short does not have a padding field after the length field; as a result, the originally picked 8 bytes were losing the lower 16 bits of the object ID, causing many false matches and unneeded request affinity to the same thread. This fix substantially improves NFS server latency and scalability in the SPEC NFS benchmark through more flexible use of multiple NFS threads.
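The difference between the two keying schemes can be sketched as follows. This is an illustration under assumptions: the function names, the window offset and the multiplier 31 are hypothetical and are not taken from the RPC server code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Old-style key: copy 8 bytes from a fixed offset inside the handle. */
static uint64_t
fh_key_fixed8(const unsigned char *fh, size_t len, size_t off)
{
	uint64_t key = 0;

	if (off + sizeof(key) <= len)
		memcpy(&key, fh + off, sizeof(key));
	return (key);
}

/* Trivial hash over every byte, so no part of the handle is ignored. */
static uint32_t
fh_key_hash(const unsigned char *fh, size_t len)
{
	uint32_t h = 0;

	for (size_t i = 0; i < len; i++)
		h = h * 31 + fh[i];
	return (h);
}
```

Two handles that differ only outside the fixed 8-byte window get the same key from the first function but different keys from the second, which is the kind of false match the commit message describes.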
|
260629 |
14-Jan-2014 |
pfg |
MFC r260545:
ext2fs: fix inode flag conversion.
After r252890 we are naively attempting to pass through the inode flags. This is technically incorrect as the ext2 inode flags don't match the UFS/system values used in FreeBSD and a clean conversion is needed.
Some filtering was left in place so the change didn't cause significant changes in FreeBSD but some of the garbage passed is likely to be the cause for warning messages in linux.
Fix the issue by resetting the flags before conversion as was done previously. This also means we will not pass the EXT4_* inode flags into FreeBSD's inode.
PR: kern/185448
|
260159 |
01-Jan-2014 |
rmacklem |
MFC: r259854 The NFSv4 server would call VOP_SETATTR() with a shared locked vnode when a Getattr for a file is done by a client other than the one that holds the file's delegation. This would only happen when delegations are enabled and the problem is fixed by this patch.
|
260144 |
31-Dec-2013 |
rmacklem |
MFC: r259845 An intermittent problem with NFSv4 exporting of ZFS snapshots was reported to the freebsd-fs mailing list. I believe the problem was caused by the Readdir operation using VFS_VGET() for a snapshot file entry instead of VOP_LOOKUP(). This would not occur for NFSv3, since it will do a VFS_VGET() of "." which fails with ENOTSUPP at the beginning of the directory, whereas NFSv4 does not check "." or "..". This patch adds a call to VFS_VGET() for the directory being read to check for ENOTSUPP. I also observed that the mount_on_fileid and fsid attributes were not correct at the snapshot's auto mountpoints when looking at packet traces for the Readdir. This patch fixes the attributes by doing a check for different v_mount structure, even if the vnode v_mountedhere is not set.
|
260143 |
31-Dec-2013 |
rmacklem |
MFC: r259801 The NFSv4 client was passing both the p and cred arguments to nfsv4_fillattr() as NULLs for the Getattr callback. This caused nfsv4_fillattr() to not fill in the Change attribute for the reply. I believe this was a violation of the RFC, but had little effect on server behaviour. This patch passes a non-NULL p argument to fix this.
|
260109 |
30-Dec-2013 |
rmacklem |
MFC: r259771 The NFSv4.1 client didn't return NFSv4.1 specific error codes for the Getattr and Recall callbacks. This patch fixes it. Since the NFSv4.1 specific error codes would only happen for abnormal circumstances, this patch has little effect, in practice.
|
260107 |
30-Dec-2013 |
rmacklem |
MFC: r259084 For software builds, the NFS client does many small synchronous (with FILE_SYNC) writes because non-contiguous byte ranges in the same buffer cache block are being written. This patch adds a new mount option "noncontigwr" which allows the non-contiguous byte ranges to be combined, with the dirty byte range becoming the superset of the bytes that are dirty, if the file has not been file locked. This reduces the number of writes significantly for software builds. The only case where this change might break existing applications is where an application is writing non-overlapping byte ranges within the same buffer cache block of a file from multiple clients concurrently. Since such an application would normally do file locking on the file, avoiding the byte range merge for files that have been file locked should be sufficient for most (maybe all?) cases.
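A rough sketch of the merge rule, assuming a per-block dirty range kept as a half-open interval. The structure and function names are hypothetical and do not come from the NFS client.

```c
#include <stdbool.h>

struct dirty_range {
	long off;	/* first dirty byte within the block */
	long end;	/* one past the last dirty byte; off == end means clean */
};

/*
 * Record a write of [off, end) into a block.  Returns true when the old
 * dirty range would have to be written out first (the synchronous write
 * the commit message wants to avoid); the range is then replaced.
 */
static bool
record_write(struct dirty_range *dr, long off, long end,
    bool noncontigwr, bool file_locked)
{
	bool clean = (dr->off == dr->end);
	bool touches = !clean && off <= dr->end && end >= dr->off;

	if (clean) {
		dr->off = off;
		dr->end = end;
		return (false);
	}
	if (touches || (noncontigwr && !file_locked)) {
		/* Grow the dirty range to the superset of both ranges. */
		if (off < dr->off)
			dr->off = off;
		if (end > dr->end)
			dr->end = end;
		return (false);
	}
	/* Non-contiguous and merging not allowed: flush, then start over. */
	dr->off = off;
	dr->end = end;
	return (true);
}
```

The merged range covers bytes this client never modified, which is why concurrent writers on other clients could be affected and why the merge is skipped for files that have been file locked.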
|
259904 |
26-Dec-2013 |
pfg |
MFC r258904, r259780: Small ext2fs updates.
Add two new reserved inodes. Make the hashing algorithm match the linux code.
PR: kern/183230
|
259814 |
24-Dec-2013 |
kib |
MFC r259521: Do not allow O_EXEC opens for fifo, return EINVAL.
|
259506 |
17-Dec-2013 |
kib |
MFC r258088: Add check for buflen overflow by comparing the buflen with both offset and resid.
MFC r258397: Redo r258088 to avoid relying on signed arithmetic overflow.
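The log does not show the code, so the following is only a generic sketch of a range check that does not rely on signed arithmetic overflow, which is undefined behaviour in C. The function name and the use of int64_t for both values are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if offset + resid can be represented in int64_t.
 * The comparison is arranged so that the addition is never performed
 * when it would overflow.
 */
static bool
sum_fits(int64_t offset, int64_t resid)
{
	if (offset < 0 || resid < 0)
		return (false);
	return (resid <= INT64_MAX - offset);
}
```

A check written as `offset + resid < offset` would perform the overflowing addition first, and a compiler is free to assume that it never happens.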
|
259238 |
11-Dec-2013 |
rmacklem |
MFC: r257901 Fix an NFSv4.1 client specific case where a forced dismount would hang. The hang occurred in nfsv4_setsequence() when it couldn't find an available session slot and is fixed by checking for a forced dismount in progress and just returning for this case.
|
259223 |
11-Dec-2013 |
pfg |
MFC r256448, r257029:
Make di_blocks unsigned in UFS1 as is the case already for UFS2. Most of the code between UFS1 and UFS2 is shared so this change is pretty safe. Not only does this make UFS1 and UFS2 consistent, but it also matches what NetBSD and MacOS X have had for some years now.
UFS2: make di_extsize unsigned. di_extsize is the EA size and as such it should be unsigned. Adjust related types for consistency.
Reviewed by: mckusick
|
259207 |
11-Dec-2013 |
rmacklem |
MFC: r257598 During code inspection, I spotted that there was a code path where CLNT_CONTROL() would be called on "client" after it was released via CLNT_RELEASE(). It was unlikely that this code path gets executed and I have not heard of any problem report caused by this bug. This patch fixes the code so that this cannot happen.
|
257122 |
25-Oct-2013 |
kib |
MFC r256502: Similar to debug.iosize_max_clamp sysctl, introduce devfs_iosize_max_clamp sysctl, which allows/disables SSIZE_MAX-sized i/o requests on the devfs files.
Approved by: re (glebius)
|
257121 |
25-Oct-2013 |
kib |
MFC r256501: Remove two instances of ARGSUSED comment, and wrap lines nearby the code that is to be changed.
Approved by: re (glebius)
|
256281 |
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
255867 |
25-Sep-2013 |
jmg |
NULL stale pointers (should be a no-op as they should no longer be used)...
Reviewed by: dteske Approved by: re (kib) Sponsored by: Vicor MFC after: 3 days
|
255866 |
25-Sep-2013 |
jmg |
fix a bug where we access a bread buffer after we have brelse'd it... The kernel normally didn't unmap/context switch away before we accessed the buffer most of the time, but under heavy I/O pressure and lots of mount/unmounting this would cause a fault on nofault panic...
Reviewed by: dteske Approved by: re (kib) Sponsored by: Vicor MFC after: 3 days
|
255442 |
10-Sep-2013 |
des |
Fix the length calculation for the final block of a sendfile(2) transmission which could be tricked into rounding up to the nearest page size, leaking up to a page of kernel memory. [SA-13:11]
In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK at the socket layer rather than pass them on to the link layer without validation or credential checks. [SA-13:12]
Prevent cross-mount hardlinks between different nullfs mounts of the same underlying filesystem. [SA-13:13]
Security: CVE-2013-5666 Security: FreeBSD-SA-13:11.sendfile Security: CVE-2013-5691 Security: FreeBSD-SA-13:12.ifioctl Security: CVE-2013-5710 Security: FreeBSD-SA-13:13.nullfs Approved by: re
|
255338 |
07-Sep-2013 |
pfg |
ext2fs: temporarily disable htree directory index.
Our code does not yet consider the case of hash collisions. This is a rather annoying situation where two or more files that happen to have the same hash value will not appear accessible.
The situation is not difficult to work-around but given that things will just work without enabling htree we will save possible embarrassments for the next release.
Reported by: Kevin Lo
|
255240 |
05-Sep-2013 |
pjd |
Handle cases where capability rights are not provided.
Reported by: kib
|
255219 |
05-Sep-2013 |
pjd |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
    uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements.
The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine a few rights, but the rights have to belong to the same array element, eg:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \
    __cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belong to the same array element this way is correct, but is not advised. It should only be used for alias definitions.
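The encoding described above can be sketched as follows, assuming the five index bits sit directly below the top two bits (bits 57 to 61), which is what the figure of 114 rights in two elements implies. The EX_ names are illustrative only, and the validity check is a guess at how mixed elements could be detected, not the actual assertion used in the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout: bits 63-62 version, bits 61-57 one-hot array index. */
#define EX_IDX_SHIFT	57
#define EX_IDX_MASK	(0x1fULL << EX_IDX_SHIFT)
#define EX_RIGHT(idx, bit)	((1ULL << (EX_IDX_SHIFT + (idx))) | (bit))

#define EX_LOOKUP	EX_RIGHT(0, 0x0000000000000400ULL)
#define EX_FCHMOD	EX_RIGHT(0, 0x0000000000002000ULL)
#define EX_PDKILL	EX_RIGHT(1, 0x0000000000000800ULL)

/* A value is usable only if exactly one index bit is set. */
static bool
ex_right_is_valid(uint64_t right)
{
	uint64_t idx = right & EX_IDX_MASK;

	return (idx != 0 && (idx & (idx - 1)) == 0);
}

/* Recover the array index from the one-hot index field. */
static int
ex_right_index(uint64_t right)
{
	uint64_t idx = (right & EX_IDX_MASK) >> EX_IDX_SHIFT;
	int i = 0;

	while (idx > 1) {
		idx >>= 1;
		i++;
	}
	return (i);
}
```

OR-ing two rights from the same element leaves a single index bit set, while OR-ing rights from elements 0 and 1 sets two index bits, so the mix can be caught at run time.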
This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
|
255216 |
04-Sep-2013 |
rmacklem |
Crashes have been observed for NFSv4.1 mounts when the system is being shut down which were caused by the nfscbd_pool being destroyed before the backchannel is disabled. This patch is believed to fix the problem, by simply avoiding ever destroying the nfscbd_pool. Since the NFS client module cannot be unloaded, this should not cause a memory leak.
MFC after: 2 weeks
|
255136 |
01-Sep-2013 |
rmacklem |
Forced dismounts of NFS mounts can fail when thread(s) are stuck waiting for an RPC reply from the server while holding the mount point busy (mnt_lockref incremented). This happens because dounmount() msleep()s waiting for mnt_lockref to become 0, before calling VFS_UNMOUNT(). This patch adds a new VFS operation called VFS_PURGE(), which the NFS client implements as purging RPCs in progress. Making this call before checking mnt_lockref fixes the problem, by ensuring that the VOP_xxx() calls will fail and unbusy the mount point.
Reported by: sbruno Reviewed by: kib MFC after: 2 weeks
|
255008 |
28-Aug-2013 |
ken |
Support storing 7 additional file flags in tmpfs:
UF_SYSTEM, UF_SPARSE, UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY, and UF_HIDDEN.
Sort the file flags tmpfs supports alphabetically. tmpfs now supports the same flags as UFS, with the exception of SF_SNAPSHOT.
Reported by: bdrewery, antoine Sponsored by: Spectra Logic
|
254925 |
26-Aug-2013 |
jhb |
Remove most of the remaining sysctl name list macros. They were only ever intended for use in sysctl(8) and it has not used them for many years.
Reviewed by: bde Tested by: exp-run by bdrewery
|
254741 |
23-Aug-2013 |
delphij |
Allow tmpfs to be mounted inside a jail.
|
254627 |
21-Aug-2013 |
ken |
Expand the use of stat(2) flags to allow storing some Windows/DOS and CIFS file attributes as BSD stat(2) flags.
This work is intended to be compatible with ZFS, the Solaris CIFS server's interaction with ZFS, somewhat compatible with MacOS X, and of course compatible with Windows.
The Windows attributes that are implemented were chosen based on the attributes that ZFS already supports.
The summary of the flags is as follows:
UF_SYSTEM:
Command line name: "system" or "usystem"
ZFS name: XAT_SYSTEM, ZFS_SYSTEM
Windows: FILE_ATTRIBUTE_SYSTEM
This flag means that the file is used by the operating system. FreeBSD does not enforce any special handling when this flag is set.
UF_SPARSE:
Command line name: "sparse" or "usparse"
ZFS name: XAT_SPARSE, ZFS_SPARSE
Windows: FILE_ATTRIBUTE_SPARSE_FILE
This flag means that the file is sparse. Although ZFS may modify this in some situations, there is not generally any special handling for this flag.
UF_OFFLINE:
Command line name: "offline" or "uoffline"
ZFS name: XAT_OFFLINE, ZFS_OFFLINE
Windows: FILE_ATTRIBUTE_OFFLINE
This flag means that the file has been moved to offline storage. FreeBSD does not have any special handling for this flag.
UF_REPARSE:
Command line name: "reparse" or "ureparse"
ZFS name: XAT_REPARSE, ZFS_REPARSE
Windows: FILE_ATTRIBUTE_REPARSE_POINT
This flag means that the file is a Windows reparse point. ZFS has special handling code for reparse points, but we don't currently have the other supporting infrastructure for them.
UF_HIDDEN:
Command line name: "hidden" or "uhidden"
ZFS name: XAT_HIDDEN, ZFS_HIDDEN
Windows: FILE_ATTRIBUTE_HIDDEN
This flag means that the file may be excluded from a directory listing if the application honors it. FreeBSD has no special handling for this flag.
The name and bit definition for UF_HIDDEN are identical to the definition in MacOS X.
UF_READONLY:
Command line name: "urdonly", "rdonly", "readonly"
ZFS name: XAT_READONLY, ZFS_READONLY
Windows: FILE_ATTRIBUTE_READONLY
This flag means that the file may not be written or appended, but its attributes may be changed.
ZFS currently enforces this flag, but Illumos developers have discussed disabling enforcement.
The behavior of this flag is different than MacOS X. MacOS X uses UF_IMMUTABLE to represent the DOS readonly permission, but that flag has a stronger meaning than the semantics of DOS readonly permissions.
UF_ARCHIVE:
Command line name: "uarch", "uarchive"
ZFS name: XAT_ARCHIVE, ZFS_ARCHIVE
Windows: FILE_ATTRIBUTE_ARCHIVE
The UF_ARCHIVE flag means that the file has changed and needs to be archived. The meaning is the same as the Windows FILE_ATTRIBUTE_ARCHIVE attribute, and the ZFS XAT_ARCHIVE and ZFS_ARCHIVE attributes.
msdosfs and ZFS have special handling for this flag. i.e. they will set it when the file changes.
sys/param.h: Bump __FreeBSD_version to 1000047 for the addition of new stat(2) flags.
chflags.1: Document the new command line flag names (e.g. "system", "hidden") available to the user.
ls.1: Reference chflags(1) for a list of file flags and their meanings.
strtofflags.c: Implement the mapping between the new command line flag names and new stat(2) flags.
chflags.2: Document all of the new stat(2) flags, and explain the intended behavior in a little more detail. Explain how they map to Windows file attributes.
Different filesystems behave differently with respect to flags, so warn the application developer to take care when using them.
zfs_vnops.c: Add support for getting and setting the UF_ARCHIVE, UF_READONLY, UF_SYSTEM, UF_HIDDEN, UF_REPARSE, UF_OFFLINE, and UF_SPARSE flags.
All of these flags are implemented using attributes that ZFS already supports, so the on-disk format has not changed.
ZFS currently doesn't allow setting the UF_REPARSE flag, and we don't really have the other infrastructure to support reparse points.
msdosfs_denode.c, msdosfs_vnops.c: Add support for getting and setting UF_HIDDEN, UF_SYSTEM and UF_READONLY in MSDOSFS.
It supported SF_ARCHIVED, but this has been changed to be UF_ARCHIVE, which has the same semantics as the DOS archive attribute instead of inverse semantics like SF_ARCHIVED.
After discussion with Bruce Evans, change several things in the msdosfs behavior:
Use UF_READONLY to indicate whether a file is writeable instead of file permissions, but don't actually enforce it.
Refuse to change attributes on the root directory, because it is special in FAT filesystems, but allow most other attribute changes on directories.
Don't set the archive attribute on a directory when its modification time is updated. Windows and DOS don't set the archive attribute in that scenario, so we are now bug-for-bug compatible.
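The msdosfs mapping can be sketched as a straight translation from DOS attribute bits to file flags. The DOS attribute values below are the standard FAT directory-entry bits; the EX_UF_ values are placeholders chosen for this sketch (real code would take the UF_ constants from sys/stat.h), and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard FAT directory-entry attribute bits. */
#define DOS_ATTR_READONLY	0x01
#define DOS_ATTR_HIDDEN		0x02
#define DOS_ATTR_SYSTEM		0x04
#define DOS_ATTR_ARCHIVE	0x20

/* Placeholder flag values for this sketch only. */
#define EX_UF_SYSTEM	0x0080u
#define EX_UF_ARCHIVE	0x0800u
#define EX_UF_READONLY	0x1000u
#define EX_UF_HIDDEN	0x8000u

/* New scheme: each DOS attribute maps directly onto one user flag. */
static uint32_t
dos_attr_to_flags(uint8_t attr)
{
	uint32_t flags = 0;

	if (attr & DOS_ATTR_READONLY)
		flags |= EX_UF_READONLY;
	if (attr & DOS_ATTR_HIDDEN)
		flags |= EX_UF_HIDDEN;
	if (attr & DOS_ATTR_SYSTEM)
		flags |= EX_UF_SYSTEM;
	if (attr & DOS_ATTR_ARCHIVE)
		flags |= EX_UF_ARCHIVE;
	return (flags);
}

/* Old scheme: the "archived" flag was the inverse of the DOS archive bit. */
static bool
legacy_archived(uint8_t attr)
{
	return ((attr & DOS_ATTR_ARCHIVE) == 0);
}
```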
smbfs_node.c, smbfs_vnops.c: Add support for UF_HIDDEN, UF_SYSTEM, UF_READONLY and UF_ARCHIVE in SMBFS.
This is similar to changes that Apple has made in their version of SMBFS (as of smb-583.8, posted on opensource.apple.com), but not quite the same.
We map SMB_FA_READONLY to UF_READONLY, because UF_READONLY is intended to match the semantics of the DOS readonly flag. The MacOS X code maps both UF_IMMUTABLE and SF_IMMUTABLE to SMB_FA_READONLY, but the immutable flags have stronger meaning than the DOS readonly bit.
stat.h: Add definitions for UF_SYSTEM, UF_SPARSE, UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY and UF_HIDDEN.
The definition of UF_HIDDEN is the same as the MacOS X definition.
Add commented-out definitions of UF_COMPRESSED and UF_TRACKED. They are defined in MacOS X (as of 10.8.2), but we do not implement them (yet).
ufs_vnops.c: Add support for getting and setting UF_ARCHIVE, UF_HIDDEN, UF_OFFLINE, UF_READONLY, UF_REPARSE, UF_SPARSE, and UF_SYSTEM in UFS. Alphabetize the flags that are supported.
These new flags are only stored; UFS does not take any action if a flag is set.
Sponsored by: Spectra Logic Reviewed by: bde (earlier version)
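As a rough illustration of the msdosfs mapping described above, the sketch below translates a FAT directory-entry attribute byte into the new user flags, and contrasts UF_ARCHIVE with the inverse semantics of the old SF_ARCHIVED. It is a user-space Python model, not the kernel code. The flag values are assumed to match FreeBSD's sys/stat.h, and the helper names dos_attr_to_flags and legacy_archived_flag are hypothetical.

```python
# DOS/FAT directory-entry attribute bits (standard FAT values).
ATTR_READONLY = 0x01
ATTR_HIDDEN = 0x02
ATTR_SYSTEM = 0x04
ATTR_ARCHIVE = 0x20

# File flags; values assumed to match FreeBSD's <sys/stat.h>.
UF_SYSTEM = 0x00000080
UF_ARCHIVE = 0x00000800
UF_READONLY = 0x00001000
UF_HIDDEN = 0x00008000
SF_ARCHIVED = 0x00010000

_ATTR_TO_FLAG = (
    (ATTR_READONLY, UF_READONLY),
    (ATTR_HIDDEN, UF_HIDDEN),
    (ATTR_SYSTEM, UF_SYSTEM),
    (ATTR_ARCHIVE, UF_ARCHIVE),
)


def dos_attr_to_flags(attr):
    """Hypothetical helper: map a DOS attribute byte to UF_* flags.

    Each DOS bit maps directly to one flag, so UF_ARCHIVE is set
    exactly when the DOS archive attribute is set.
    """
    flags = 0
    for attr_bit, flag in _ATTR_TO_FLAG:
        if attr & attr_bit:
            flags |= flag
    return flags


def legacy_archived_flag(attr):
    """Hypothetical helper: the old, inverted SF_ARCHIVED semantics.

    SF_ARCHIVED meant "already archived", i.e. it was reported when
    the DOS archive attribute was clear.
    """
    return 0 if attr & ATTR_ARCHIVE else SF_ARCHIVED
```

For example, a file with the read-only and archive attributes set would report UF_READONLY | UF_ARCHIVE under this model, whereas the old scheme would have reported no archive-related flag at all for the same file.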
|
254602 |
21-Aug-2013 |
kib |
Make the seek a method of the struct fileops.
Tested by: pho Sponsored by: The FreeBSD Foundation
|
254601 |
21-Aug-2013 |
kib |
Extract the general-purpose code from tmpfs to perform uiomove from the page queue of some vm object.
Discussed with: alc Tested by: pho Sponsored by: The FreeBSD Foundation
|
254415 |
16-Aug-2013 |
kib |
Restore the previous sendfile(2) behaviour on the block devices. Provide valid .fo_sendfile method for several missed struct fileops.
Reviewed by: glebius Sponsored by: The FreeBSD Foundation
|
254337 |
14-Aug-2013 |
rmacklem |
Fix several performance related issues in the new NFS server's DRC for NFS over TCP.
- Increase the size of the hash tables.
- Create a separate mutex for each hash list of the TCP hash table.
- Single thread the code that deletes stale cache entries.
- Add a tunable called vfs.nfsd.tcphighwater, which can be increased to allow the cache to grow larger, avoiding the overhead of frequent scans to delete stale cache entries. (The default value will result in frequent scans to delete stale cache entries, analogous to what the pre-patched code does.)
- Add a tunable called vfs.nfsd.cachetcp that can be used to disable DRC caching for NFS over TCP, since the old NFS server didn't DRC cache TCP.
It also adjusts the size of nfsrc_floodlevel dynamically, so that it is always greater than vfs.nfsd.tcphighwater.
For UDP the algorithm remains the same as the pre-patched code, but the tunable vfs.nfsd.udphighwater can be used to allow the cache to grow larger and reduce the overhead caused by frequent scans for stale entries. UDP also uses a larger hash table size than the pre-patched code.
Reported by: wollman Tested by: wollman (earlier version of patch) Submitted by: ivoras (earlier patch) Reviewed by: jhb (earlier version of patch) MFC after: 1 month
|
254326 |
14-Aug-2013 |
pfg |
ext2fs: update format specifiers for ext4 type.
The previous band-aid was not appropriate and didn't really work for all platforms. While here, clean up the surrounding code to match ffs_checkoverlap().
Reported by: dim, jmallet and bde MFC after: 3 weeks
|
254286 |
13-Aug-2013 |
pfg |
ext2fs: update format specifiers for ext4 type.
Reported by: Sam Fourman Jr. MFC after: 3 weeks
|
254283 |
13-Aug-2013 |
pfg |
Define ext2fs local types and use them.
Add definitions for e2fs_daddr_t, e4fs_daddr_t in addition to the already existing e2fs_lbn_t and adjust them for ext4. Other than making the code more readable these changes should fix problems related to big filesystems.
Setting the proper types can be tricky so the process was helped by looking at UFS. In our implementation, logical block numbers can be negative and the code depends on it. In ext2, block numbers are unsigned so it is convenient to keep e2fs_daddr_t unsigned and use the complete 32 bits. In the case of e4fs_daddr_t, while the value should be unsigned, for ext4 we only need to support 48 bits so preserving an extra bit from the sign is not an issue.
While here also drop the ext2_setblock() prototype that was never used.
Discussed with: mckusick, bde MFC after: 3 weeks
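The width reasoning above can be checked with a little arithmetic. This is only a sketch: it assumes e2fs_daddr_t is an unsigned 32-bit type and e4fs_daddr_t a signed 64-bit type, which the log implies but does not spell out, and the fits() helper is hypothetical.

```python
# Largest values representable by the integer widths under discussion.
UINT32_MAX = 2**32 - 1
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

# ext4 block numbers need at most 48 bits, per the commit message.
EXT4_BLOCK_BITS = 48
max_ext4_block = 2**EXT4_BLOCK_BITS - 1


def fits(value, type_max):
    """Hypothetical helper: is a non-negative block number representable?"""
    return 0 <= value <= type_max


# A signed 32-bit type loses the top half of the ext2 block range,
# while an unsigned 32-bit type covers all of it.
ext2_needs_unsigned = fits(UINT32_MAX, UINT32_MAX) and not fits(UINT32_MAX, INT32_MAX)

# A signed 64-bit type still has spare bits above the 48 that ext4 needs,
# so giving one bit to the sign costs nothing.
spare_bits = INT64_MAX.bit_length() - EXT4_BLOCK_BITS
```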
|
254260 |
12-Aug-2013 |
pfg |
Add read-only support for extents in ext2fs.
Basic support for extents was implemented by Zheng Liu as part of his Google Summer of Code in 2010. This support is read-only at this time.
In addition to extents we also support the huge_file extension for read-only purposes. This works nicely with the additional support for birthtime/nanosec timestamps and dir_index that have been added lately.
The implementation may not work for all ext4 filesystems as it doesn't support some features that are being enabled by default on recent linux like flex_bg. Nevertheless, the feature should be very useful for migration or simple access in filesystems that have been converted from ext2/3 or don't use incompatible features.
Special thanks to Zheng Liu for his dedication and continued work to support ext2 in FreeBSD.
Submitted by: Zheng Liu (lz@) Reviewed by: Mike Ma, Christoph Mallon (previous version) Sponsored by: Google Inc. MFC after: 3 weeks
|
254138 |
09-Aug-2013 |
attilio |
The soft and hard busy mechanisms rely on the vm object lock to work. Unify the two concepts into a real, minimal sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard path for this new lock, and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in either read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the owning thread cannot make any assumption about the busy state unless it is also busying the page.
Also:
- Add a new flag to directly shared-busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag.
The KPI is heavily changed, which is why the version is bumped. It is very likely that some VM ports users will need to change their own code.
Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
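The shared/exclusive semantics described above can be modelled in a few lines. This is a user-space Python sketch of the idea only: the class and method names are hypothetical, it uses non-blocking "try" operations instead of sleeping, and it ignores the vm_object interlock entirely.

```python
class PageBusy:
    """Toy model of a per-page busy lock.

    Shared acquisition stands for the old soft busy and exclusive
    acquisition for the old hard busy.
    """

    def __init__(self):
        self.sharers = 0        # number of soft-busy holders
        self.exclusive = False  # True while hard busied

    def try_sbusy(self):
        """Take a shared (soft) busy unless the page is hard busied."""
        if self.exclusive:
            return False
        self.sharers += 1
        return True

    def sunbusy(self):
        """Drop one shared busy reference."""
        if self.sharers == 0:
            raise RuntimeError("page is not soft busied")
        self.sharers -= 1

    def try_xbusy(self):
        """Take the exclusive (hard) busy only if nobody holds the page."""
        if self.exclusive or self.sharers > 0:
            return False
        self.exclusive = True
        return True

    def xunbusy(self):
        """Drop the exclusive busy."""
        if not self.exclusive:
            raise RuntimeError("page is not hard busied")
        self.exclusive = False
```

Several soft-busy holders can coexist, but a hard busy excludes everyone else; in the kernel a failed attempt would sleep on the page rather than return False.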
|
254104 |
08-Aug-2013 |
pfg |
Small typo.
MFC after: 3 days
|
253967 |
05-Aug-2013 |
kib |
The tmpfs_alloc_vp() is used to instantiate vnode for the tmpfs node, in particular, from the tmpfs_lookup VOP method. If LK_NOWAIT is not specified in the lkflags, the lookup is supposed to return an alive vnode whenever the underlying node is valid.
Currently, the tmpfs_alloc_vp() returns ENOENT if the vnode attached to node exists and is being reclaimed. This causes spurious ENOENT errors from lookup on tmpfs and corresponding random 'No such file' failures from syscalls working with tmpfs files.
Fix this by waiting for the doomed vnode to be detached from the tmpfs node if sleepable allocation is requested.
Note that filesystems which use vfs_hash.c, correctly handle the case due to vfs_hash_get() looping when vget() returns ENOENT for sleepable requests.
Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
253953 |
05-Aug-2013 |
attilio |
Revert r253939: We cannot busy a page before doing pagefaults. In fact, it can deadlock against the vnode lock, as it tries to vget(). Other functions, right now, have the opposite lock ordering, like vm_object_sync(), which acquires the vnode lock first and then sleeps on the busy mechanism.
Before this patch is reinserted we need to break this ordering.
Sponsored by: EMC / Isilon storage division Reported by: kib
|
253939 |
04-Aug-2013 |
attilio |
The page hold mechanism is fast but it has a couple of drawbacks:
- It does not let pages respect the LRU policy.
- It bloats the active/inactive queues by a few pages.
Try to avoid it as much as possible with the long-term target to completely remove it. Use the soft-busy mechanism to protect page content accesses during short-term operations (like uiomove_fromphys()).
After this change only vm_fault_quick_hold_pages() is still using the hold mechanism for page content access. There is an additional complexity there as the quick path cannot immediately access the page object to busy the page and the slow path cannot busy more than one page at a time (to avoid deadlocks).
Fixing this primitive could lead to the complete removal of the page hold mechanism.
Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff Tested by: pho
|
253927 |
04-Aug-2013 |
attilio |
Remove the unnecessary soft busy of the page before calling vn_rdwr() in kern_sendfile(). The page is already wired so it will not be subjected to pagefault. The content cannot be effectively protected as it is full of races already. Multiple accesses to the same indexes are serialized through vn_rdwr().
Sponsored by: EMC / Isilon storage division Reviewed by: alc, jeff Tested by: pho
|
253861 |
01-Aug-2013 |
pfg |
Add license for the half MD4 algorithm used in ext2_half_md4().
The htree implementation uses code derived from the RSA Data Security, Inc. MD4 Message-Digest Algorithm.
Add a proper licensing statement for the code and clarify the corresponding comments.
Approved by: core (hrs)
|
253742 |
28-Jul-2013 |
marius |
- Add const-qualifiers to the arguments of isonum_*().
- According to ISO 9660 7.1.2, isonum_712() should return a signed value.
- Try to get isonum_*() closer to style(9).
|
253677 |
26-Jul-2013 |
avg |
make path matching in devfs rules consistent and sane (and safer)
Before this change path matching had the following features:
- for device nodes the patterns were matched against full path
- in the above case '/' in a path could be matched by a wildcard
- for directories and links only the last component was matched
So, for example, a pattern like 're*' could match the following entries:
- re0 device
- responder/u0 device
- zvol/recpool directory
Although it was possible to work around this behavior (once it was spotted and understood), it was very confusing and contrary to documentation.
Now we always match a full path for all types of devfs entries (devices, directories, links) and a '/' has to be matched explicitly. This behavior follows the shell globbing rules.
This change is originally developed by Jaakko Heinonen. Many thanks!
PR: kern/122838 Submitted by: jh MFC after: 4 weeks
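The new matching rule, where a wildcard never crosses a '/', can be sketched as follows. This is a simplified user-space Python model, not the devfs code: it handles only '*' and '?' (no bracket expressions), and the function names are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob into a regex in which '*' and '?' never match '/'."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")   # any run of characters within one component
        elif ch == "?":
            parts.append("[^/]")    # exactly one character, but not '/'
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def path_matches(pattern, path):
    """Match the pattern against the full path, as the new rules do."""
    return glob_to_regex(pattern).fullmatch(path) is not None
```

Under this model 're*' matches re0 but no longer matches responder/u0 or zvol/recpool; reaching those entries takes an explicit 'responder/*' or 'zvol/re*'.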
|
253657 |
25-Jul-2013 |
pfg |
ext2fs: Return EINVAL for negative uio_offset as in UFS.
While here drop old comment that doesn't really apply.
MFC after: 1 month Discussed with: gleb
|
253652 |
25-Jul-2013 |
pfg |
ext2fs: Drop a check that wasn't supposed to be in r253651.
MFC after: 1 month
|
253651 |
25-Jul-2013 |
pfg |
ext2fs: Don't assume that on-disk format of a directory is the same as in <sys/dirent.h>
ext2_readdir() has always been very fs specific and different with respect to its ufs_ counterpart. Recent changes from UFS have made it possible to share more closely the implementation.
MFUFS r252438: Always start parsing at DIRBLKSIZ aligned offset, skip first entries if uio_offset is not DIRBLKSIZ aligned. Return EINVAL if buffer is too small for single entry.
Preallocate buffer for cookies.
Skip entries with zero inode number.
Reviewed by: gleb, Zheng Liu MFC after: 1 month
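The offset handling merged from UFS can be sketched like this. It is a toy Python model rather than the ext2_readdir() code: the 512-byte DIRBLKSIZ default, the (ino, reclen, name) tuples, and both function names are assumptions made for illustration.

```python
def aligned_start(uio_offset, dirblksiz=512):
    """Return (start, skip): the block-aligned parse offset and the number
    of bytes of leading entries to skip before reaching uio_offset."""
    if uio_offset < 0:
        raise ValueError("negative offset")  # the kernel returns EINVAL
    start = uio_offset - (uio_offset % dirblksiz)
    return start, uio_offset - start


def visible_entries(entries, uio_offset, dirblksiz=512):
    """Walk entries laid out from the aligned start and return the names
    a reader positioned at uio_offset should see.

    entries: list of (ino, reclen, name) tuples for one directory block.
    """
    offset, _ = aligned_start(uio_offset, dirblksiz)
    names = []
    for ino, reclen, name in entries:
        # Skip entries before the requested offset and entries whose
        # inode number is zero (unused slots).
        if offset >= uio_offset and ino != 0:
            names.append(name)
        offset += reclen
    return names
```

Parsing always begins on a block boundary so that record lengths are read from a known-good position, even when the caller resumes in the middle of a block.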
|
253619 |
24-Jul-2013 |
pfg |
fuse: revert kernel_header update.
It seems to be causing problems due to the lack of the new features.
Found by: bapt Pointed hat: pfg
|
253573 |
23-Jul-2013 |
nwhitehorn |
tmpfs works perfectly fine with -o union -- there is no reason to exclude it from the list of options.
|
253506 |
20-Jul-2013 |
rmacklem |
The NFSv4 server incorrectly assumed that the high order words of the attribute bitmap argument would be non-zero. This caused an interoperability problem for a recent patch to the Linux NFSv4 client. The Linux folks have changed their patch to avoid this, but this patch fixes the problem on the server.
Reported and tested by: Andre Heider (a.heider@gmail.com) MFC after: 3 days
|
253498 |
20-Jul-2013 |
pfg |
fuse: revert birthtime support.
The creation time support breaks the data structures used in linux fuse. libfuse carries its own header.
Revert the changes for now. We will try to get an agreement with the fuse upstream maintainers to avoid having to patch the library headers all the time.
|
253479 |
20-Jul-2013 |
pfg |
Adjust outsizes:
Recalculate FUSE_COMPAT_ENTRY_OUT_SIZE and COMPAT_ATTR_OUT_SIZE. These were wrong in the previous commit. They are actually unused in FreeBSD though.
Pointed out by: Jan Beich
|
253478 |
20-Jul-2013 |
pfg |
Adjust outsizes:
When birthtime was added (r253331) we missed adding the weight of the new fields in FUSE_COMPAT_ENTRY_OUT_SIZE and COMPAT_ATTR_OUT_SIZE. Adjust them accordingly.
Pointed out by: Jan Beich
|
253344 |
15-Jul-2013 |
pfg |
Update fuse_kernel header.
Bring in the changes from the FUSE kernel interface 7.10 (available under a BSD license).
After 7.10 the linux FUSE developers added support for a controversial CUSE driver and some linux-specific features that are unlikely to find their way into FreeBSD.
We currently don't implement any of the new features so we are *not* bumping the FUSE_KERNEL_MINOR_VERSION. The header should, nevertheless, serve as a template to add the new features in a compatible manner.
While here adopt some minor cleanups from the upstream version like removing FUSE_MAJOR and FUSE_MINOR which were never used. Also add multiple inclusion header guards.
|
253331 |
13-Jul-2013 |
pfg |
Add creation timestamp (birthtime) support for fuse.
I was keeping this #ifdef'd for reference with the MacFUSE change[1] but on second thought, this is a FreeBSD-only header so the SVN history should be enough.
Add missing padding while here.
Reference [1]: http://code.google.com/p/macfuse/source/detail?spec=svn1686&r=1360
|
253276 |
12-Jul-2013 |
pfg |
Add creation timestamp (birthtime) support for fuse.
This is based on similar support in MacFUSE.
|
253173 |
10-Jul-2013 |
pfg |
Implement 1003.1-2001 pathconf() keys.
This is based on r106058 in UFS.
MFC after: 1 month
|
253098 |
09-Jul-2013 |
pfg |
Reinstate the assertion from r253045.
UFS r232732 reverted the change as the real problem was to be fixed at the syscall level.
Reported by: bde
|
253050 |
09-Jul-2013 |
pfg |
Enhancement when writing an entire block of a file.
Merge from UFS r231313:
This change first attempts the uiomove() to the newly allocated (and dirty) buffer and only zeros it if the uiomove() fails. The effect is to eliminate the gratuitous zeroing of the buffer in the usual case where the uiomove() successfully fills it.
MFC after: 3 days
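The ordering described above (copy first, zero only on failure) can be modelled as below. This is a Python sketch of the control flow only; write_full_block and the copy_in callback are hypothetical stand-ins for the kernel write path and uiomove(), and an exception stands in for a non-zero error return.

```python
def write_full_block(buf, copy_in):
    """Fill a newly allocated block buffer from the caller's data.

    buf:     bytearray holding stale contents from a previous use.
    copy_in: callable that overwrites buf, raising OSError on failure
             (for example, a fault while copying from user space).
    """
    try:
        # Common case: the copy overwrites the whole block, so zeroing
        # it beforehand would have been wasted work.
        copy_in(buf)
    except OSError:
        # Rare case: the copy failed part-way. Zero the buffer so stale
        # data never becomes visible, then report the error.
        buf[:] = bytes(len(buf))
        raise
```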
|
253049 |
09-Jul-2013 |
rmacklem |
Add support for host-based (Kerberos 5 service principal) initiator credentials to the kernel rpc. Modify the NFSv4 client to add support for the gssname and allgssname mount options to use this capability. Requires the gssd daemon to be running with the "-h" option.
Reviewed by: jhb
|
253045 |
08-Jul-2013 |
pfg |
Avoid a panic and return EINVAL instead.
Merge from UFS r232692: syscall() fuzzing can trigger this panic.
MFC after: 3 days
|
252956 |
07-Jul-2013 |
pfg |
Implement SEEK_HOLE/SEEK_DATA for ext2fs.
Merged from r236044 on UFS.
MFC after: 3 days
|
252907 |
07-Jul-2013 |
pfg |
Fix some typos.
MFC after: 1 week
|
252890 |
06-Jul-2013 |
pfg |
Initial implementation of the HTree directory index.
This is a port of NetBSD's GSoC 2012 Ext3 HTree directory indexing by Vyacheslav Matyushin. It was cleaned up and enhanced for FreeBSD by Zheng Liu (lz@).
This is an excellent example of work shared among different projects: Vyacheslav was able to look at an early prototype from Zheng Liu who was also able to check the code from Haiku (with permission).
As in linux, the feature is not available by default and must be enabled explicitly with tune2fs. We still do not support the workarounds required in readdir for NFS.
Submitted by: Zheng Liu Tested by: Mike Ma Sponsored by: Google Inc. MFC after: 1 week
|
252714 |
04-Jul-2013 |
kib |
The tvp vnode on rename is usually unlinked. Drop the cached null vnode for tvp to allow the free of the lower vnode, if needed.
PR: kern/180236 Tested by: smh Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
252558 |
03-Jul-2013 |
davide |
- Fix double frees/use after free.
- Allocate using smb_rq_alloc() instead of inlining it.
Reported by: uqs Found with: Coverity Scan
|
252528 |
03-Jul-2013 |
rmacklem |
A problem with the old NFS client where large writes to large files would sometimes result in a corrupted file was reported via email. This problem appears to have been caused by r251719 (reverting r251719 fixed the problem). Although I have not been able to reproduce this problem, I suspect it is caused by another thread increasing np->n_size after the mtx_unlock(&np->n_mtx) but before the vnode_pager_setsize() call. Since the np->n_mtx mutex serializes updates to np->n_size, doing the vnode_pager_setsize() with the mutex locked appears to avoid the problem. Unfortunately, vnode_pager_setsize() where the new size is smaller, cannot be called with a mutex held. This patch returns the semantics to be close to pre-r251719 (actually pre-r248567, r248581, r248567 for the new client) such that the call to vnode_pager_setsize() is only delayed until after the mutex is unlocked when np->n_size is shrinking. Since the file is growing when being written, I believe this will fix the corruption. A better solution might be to replace the mutex with a sleep lock, but that is a non-trivial conversion, so this fix is hoped to be sufficient in the meantime.
Reported by: David G. Lawrence (dg@dglawrence.com) Tested by: David G. Lawrence (to be done soon) Reviewed by: kib MFC after: 1 week
|
252397 |
30-Jun-2013 |
pfg |
ext2fs: Use the complete random() range in i_gen.
i_gen is unsigned in ext2fs so we can handle the complete 32 bits.
MFC after: 1 week
|
252364 |
29-Jun-2013 |
pfg |
Bring some updates from ufs_lookup to ext2fs.
r156418:
Don't set IN_CHANGE and IN_UPDATE on inodes for potentially suspended file systems. This could cause deadlocks when creating snapshots. (We can't do snapshots on ext2fs but it is useful to keep things in sync).
r183079:
- Only set i_offset in the parent directory's i-node during a lookup for non-LOOKUP operations.
- Relax a VOP assertion for a DELETE lookup.
r187528:
Move the code from ufs_lookup.c used to do dotdot lookup, into the helper function. It is supposed to be useful for any filesystem that has to unlock dvp to walk to the ".." entry in lookup routine.
MFC after: 5 days
|
252355 |
28-Jun-2013 |
davide |
Properly use v_data field. This magically worked (even if wrong) until now because v_data is the first field of the structure, but it's not something we should rely on.
|
252353 |
28-Jun-2013 |
davide |
Garbage collect a useless check. smp should never be NULL.
|
252352 |
28-Jun-2013 |
davide |
Plug a couple of leakages in smbfs_lookup().
|
252259 |
26-Jun-2013 |
pfg |
Minor sorting.
MFC after: 3 days
|
252103 |
23-Jun-2013 |
pfg |
Define and use e2fs_lbn_t in ext2fs.
In line to what is done in UFS, define an internal type e2fs_lbn_t for the logical block numbers.
This change is basically a no-op as the new type is unchanged (int32_t) but it may be useful as bumping this may be required for ext4fs.
Also, as pointed out by Bruce Evans:
- Use daddr_t for daddr in ext2_bmaparray(). This seems to improve reliability with the reallocblks option.
- Add a cast to the fsbtodb() macro as in UFS.
Reviewed by: bde MFC after: 3 days
|
252100 |
22-Jun-2013 |
rmacklem |
Fix r252074 so that it builds on 64bit arches.
|
252074 |
21-Jun-2013 |
rmacklem |
The NFSv4.1 LayoutCommit operation requires a valid offset and length. (0, 0 is not sufficient) This patch adds a loop for each file layout, using the offset, length of each file layout in a separate LayoutCommit.
|
252072 |
21-Jun-2013 |
rmacklem |
When the NFSv4.1 client is writing to a pNFS Data Server (DS), the file's size attribute does not get updated. As such, it is necessary to invalidate the attribute cache before clearing NMODIFIED for pNFS.
MFC after: 2 weeks
|
252067 |
21-Jun-2013 |
rmacklem |
Since some NFSv4 servers enforce the requirement for a reserved port#, enable use of the (no)resvport mount option for NFSv4. I had thought that the RFC required that non-reserved port #s be allowed, but I couldn't find it in the RFC.
MFC after: 2 weeks
|
252012 |
20-Jun-2013 |
pfg |
Rename some prefixes in the Block Group Descriptor fields to ext4bgd_
Change prefix to avoid confusion and denote that these fields are generally only available starting with ext4.
MFC after: 3 days
|
251952 |
18-Jun-2013 |
pfg |
More ext2fs header cleanups:
- Set MAXMNTLEN nearer to where it is used.
- Move EXT2_LINK_MAX to ext2_dir.h.
MFC after: 3 days
|
251823 |
17-Jun-2013 |
pfg |
Rename remaining DIAGNOSTIC to INVARIANTS.
MFC after: 3 days
|
251809 |
16-Jun-2013 |
pfg |
Re-sort ext2fs headers to make things easier to find.
In the ext2fs driver we have a mixture of headers:
- The ext2_ prefixed headers have strong influence from NetBSD and carry specific ext2/3/4 information.
- The unprefixed headers are inspired by UFS and carry implementation specific information.
Do some small adjustments so that the information is easier to find coming from either UFS or the NetBSD implementation.
MFC after: 3 days
|
251677 |
13-Jun-2013 |
pfg |
Relax some unnecessary unsigned type changes in ext2fs.
While the changes in r245820 are in line with the ext2 spec, the code derived from UFS can use negative values so it is better to relax some types to keep them as they were, and somewhat more similar to UFS. While here clean some casts.
Some of the original types are still wrong and will require more work.
Discussed with: bde MFC after: 3 days
|
251658 |
12-Jun-2013 |
pfg |
Turn DIAGNOSTICs to INVARIANTS in ext2fs.
This is done to be consistent with what other filesystems and particularly ffs already does (see r173464).
MFC after: 5 days
|
251612 |
11-Jun-2013 |
pfg |
s/file system/filesystem/g
Based on r96755 from UFS.
MFC after: 3 days
|
251562 |
09-Jun-2013 |
pfg |
e2fs_bpg and e2fs_isize are always unsigned.
The superblock in ext2fs defines all the fields as unsigned but for some reason the in-memory superblock was carrying e2fs_bpg and e2fs_isize as signed.
We should preserve the specified types for consistency.
MFC after: 5 days
|
251505 |
07-Jun-2013 |
alc |
Add missing VM object unlocks in an error case.
Reviewed by: kib
|
251452 |
06-Jun-2013 |
alc |
Don't busy the page unless we are likely to release the object lock.
Reviewed by: kib Sponsored by: EMC / Isilon Storage Division
|
251423 |
05-Jun-2013 |
alc |
Relax the vm object locking. Use a read lock.
Sponsored by: EMC / Isilon Storage Division
|
251383 |
04-Jun-2013 |
alc |
Eliminate unnecessary vm object locking from tmpfs_nocacheread().
|
251346 |
03-Jun-2013 |
pfg |
ext2fs: space vs tab.
Obtained from: Christoph Mallon MFC after: 3 days
|
251344 |
03-Jun-2013 |
pfg |
ext2fs: Small cosmetic fixes.
Make a long macro readable and sort a header.
Obtained from: Christoph Mallon MFC after: 3 days
|
251336 |
03-Jun-2013 |
pfg |
ext2fs: Update Block Group Descriptor struct.
Uncover some, previously reserved, fields that are used by Ext4. These are currently unused but it is good to have them for future reference.
Reviewed by: bde MFC after: 3 days
|
251171 |
31-May-2013 |
jeff |
- Convert the bufobj lock to rwlock.
- Use a shared bufobj lock in getblk() and inmem().
- Convert softdep's lk to rwlock to match the bufobj lock.
- Move INFREECNT to b_flags and protect it with the buf lock.
- Remove unnecessary locking around bremfree() and BKGRDINPROG.
Sponsored by: EMC / Isilon Storage Division Discussed with: mckusick, kib, mdf
|
251149 |
30-May-2013 |
kib |
Assert that OBJ_TMPFS flag on the vm object for the tmpfs node is cleared when the tmpfs node is going away.
Tested by: bdrewery, pho
|
251079 |
28-May-2013 |
rmacklem |
Post-r248567, there were times when the client would return a truncated directory for some NFS servers. This turned out to be because the size of a directory reported by an NFS server can be smaller than the ufs-like directory created from the RPC XDR in the client. This patch fixes the problem by changing r248567 so that vnode_pager_setsize() is only done for regular files.
Reported and tested by: hartmut.brandt@dlr.de Reviewed by: kib MFC after: 1 week
|
250852 |
21-May-2013 |
kib |
Do not leak the NULLV_NOUNLOCK flag from the nullfs_unlink_lowervp(), for the case when the nullfs vnode is not reclaimed. Otherwise, later reclamation would not unlock the lower vnode.
Reported by: antoine Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
250657 |
15-May-2013 |
des |
Fix typo in comment.
Submitted by: Alex Weber <alexwebr@gmail.com> MFC after: 1 week
|
250580 |
12-May-2013 |
rmacklem |
Add support for the eofflag to nfs_readdir() in the new NFS client so that it works under a unionfs mount.
Submitted by: Jared Yanovich (slovichon@gmail.com) Reviewed by: kib MFC after: 2 weeks
|
250576 |
12-May-2013 |
eadler |
Fix several typos
PR: kern/176054 Submitted by: Christoph Mallon <christoph.mallon@gmx.de> MFC after: 3 days
|
250567 |
12-May-2013 |
jilles |
fdescfs: Supply a real value for d_type in readdir.
All the fdescfs nodes (except . and ..) appear as character devices to stat(), so DT_CHR is correct.
|
250505 |
11-May-2013 |
kib |
- Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The null_hashget() obtains the reference on the nullfs vnode, which must be dropped.
- Fix a wart which existed from the introduction of the nullfs caching, do not unlock lower vnode in the nullfs_reclaim_lowervp(). It should be innocent, but now it is also formally safe. Inform the nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on nullfs inode.
- Add a callback to the upper filesystems for the lower vnode unlinking. When inactivating a nullfs vnode, check if the lower vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC on the lower vnode, and reclaim upper vnode if so. This allows nullfs to purge cached vnodes for the unlinked lower vnode, avoiding excessive caching.
Reported by: Göran Löwkrantz <goran.lowkrantz@ismobile.com> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
250310 |
06-May-2013 |
kib |
Avoid deactivating the page if it is already on a queue, only requeue the page. This both reduces the amount of queue locking and avoids moving an active page to the inactive list just because the page was read or written.
Based on the suggestion by: alc Reviewed by: alc Tested by: pho
|
250238 |
04-May-2013 |
davide |
Change VM_OBJECT_LOCK/UNLOCK() -> VM_OBJECT_WLOCK/WUNLOCK() to reflect the recent switch of the vm object lock to a rwlock.
Reported by: attilio
|
250237 |
04-May-2013 |
davide |
Overhaul locking in netsmb, getting rid of the obsolete lockmgr() primitive. This solves a long standing LOR between smb_conn and smb_vc.
Tested by: martymac, pho (previous version)
|
250236 |
04-May-2013 |
davide |
Completely rewrite the interface to smbdev, switching from dev_clone to cdevpriv(9). This commit changes the semantics of mount_smbfs in userland as well, which now passes a file descriptor in order to mount a specific filesystem instance.
Reviewed by: attilio, ed Tested by: martymac
|
250193 |
02-May-2013 |
kib |
The fsync(2) call should sync the vnode in such a way that even after a system crash which happens after a successful fsync() return, the data is accessible. For msdosfs, this means that FAT entries for the file must be written.
Since we do not track the FAT blocks containing entries for the current file, just do a sloppy sync of the devvp vnode for the mount, whose buffers, among other things, contain FAT blocks.
Simultaneously, for deupdat():
- optimize by clearing the modified flags before short-circuiting a return, if the mount is read-only;
- only ignore the rest of the function for denode with DE_MODIFIED flag clear when the waitfor argument is false. The directory buffer for the entry might be of delayed write;
- microoptimize by comparing the updated directory entry with the current block content;
- try to cluster the write, fall back to bawrite() if low on resources.
Based on the submission by: bde MFC after: 2 weeks
|
250190 |
02-May-2013 |
kib |
Fix the v_object leak for non-regular tmpfs vnodes.
Reported and tested by: pho Sponsored by: The FreeBSD Foundation
|
250189 |
02-May-2013 |
kib |
For the new regular tmpfs vnode, v_object is initialized before insmntque() is called. The standard insmntque destructor resets the vop vector to the deadfs one, and calls vgone() on the vnode. As a result, v_object is kept unchanged, which triggers an assertion in the reclaim code on insmntque() failure. Also, in this case, the OBJ_TMPFS flag on the backing vm object is not cleared.
Provide the tmpfs insmntque() destructor which properly clears OBJ_TMPFS flag and resets v_object.
Reported and tested by: pho Sponsored by: The FreeBSD Foundation
|
250188 |
02-May-2013 |
kib |
The page read or written could be wired. Do not requeue if the page is not on a queue.
Reported and tested by: pho Sponsored by: The FreeBSD Foundation
|
250055 |
29-Apr-2013 |
des |
Fix a bug that allows NFS clients to issue READDIR on files.
PR: kern/178016 Security: CVE-2013-3266 Security: FreeBSD-SA-13:05.nfsserver
|
250030 |
28-Apr-2013 |
kib |
Rework the handling of the tmpfs node backing swap object and tmpfs vnode v_object to avoid double-buffering. Use the same object both as the backing store for tmpfs node and as the v_object.
Besides reducing memory use by up to a factor of two when files are mapped from tmpfs, it also makes tmpfs read and write operations copy half as many bytes.
VM subsystem was already slightly adapted to tolerate OBJT_SWAP object as v_object. Now the vm_object_deallocate() is modified to not reinstantiate OBJ_ONEMAPPING flag and help the VFS to correctly handle VV_TEXT flag on the last dereference of the tmpfs backing object.
Reviewed by: alc Tested by: pho, bf MFC after: 1 month
|
249630 |
18-Apr-2013 |
rmacklem |
When an NFS unmount occurs, once vflush() writes the last dirty buffer for the last vnode on the mount back to the server, it returns. At that point, the code continues with the unmount, including freeing up the nfs specific part of the mount structure. It is possible that an nfsiod thread will try to check for an empty I/O queue in the nfs specific part of the mount structure after it has been free'd by the unmount. This patch avoids this problem by setting the iodmount entries for the mount back to NULL while holding the mutex in the unmount and checking the appropriate entry is non-NULL after acquiring the mutex in the nfsiod thread.
Reported and tested by: pho Reviewed by: kib MFC after: 2 weeks
|
249623 |
18-Apr-2013 |
rmacklem |
Both NFS clients can deadlock when using the "rdirplus" mount option. This can occur when an nfsiod thread that already holds a buffer lock attempts to acquire a vnode lock on an entry in the directory (a LOR) when another thread holding the vnode lock is waiting on an nfsiod thread. This patch avoids the deadlock by disabling readahead for this case, so the nfsiod threads never do readdirplus. Since readaheads for directories need the directory offset cookie from the previous read, they cannot normally happen in parallel. As such, testing by jhb@ and myself didn't find any performance degradation when this patch is applied. If there is a case where this results in a significant performance degradation, mounting without the "rdirplus" option can be done to re-enable readahead for directories.
Reported and tested by: jhb Reviewed by: jhb MFC after: 2 weeks
|
249596 |
17-Apr-2013 |
ken |
Move the NFS FHA (File Handle Affinity) code from sys/nfsserver to sys/nfs, since it is now shared by the two NFS servers.
Suggested by: rmacklem Sponsored by: Spectra Logic MFC after: 2 weeks
|
249592 |
17-Apr-2013 |
ken |
Revamp the old NFS server's File Handle Affinity (FHA) code so that it will work with either the old or new server.
The FHA code keeps a cache of currently active file handles for NFSv2 and v3 requests, so that read and write requests for the same file are directed to the same group of threads (reads) or thread (writes). It does not currently work for NFSv4 requests. They are more complex, and will take more work to support.
This improves read-ahead performance, especially with ZFS, if the FHA tuning parameters are configured appropriately. Without the FHA code, concurrent reads that are part of a sequential read from a file will be directed to separate NFS threads. This has the effect of confusing the ZFS zfetch (prefetch) code and makes sequential reads significantly slower with clients like Linux that do a lot of prefetching.
The FHA code has also been updated to direct write requests to nearby file offsets to the same thread in the same way it batches reads, and the FHA code will now also send writes to multiple threads when needed.
This improves sequential write performance in ZFS, because writes to a file are now more ordered. Since NFS writes (generally less than 64K) are smaller than the typical ZFS record size (usually 128K), out of order NFS writes to the same block can trigger a read in ZFS. Sending them down the same thread increases the odds of their being in order.
In order for multiple write threads per file in the FHA code to be useful, writes in the NFS server have been changed to use a LK_SHARED vnode lock, and upgrade that to LK_EXCLUSIVE if the filesystem doesn't allow multiple writers to a file at once. ZFS is currently the only filesystem that allows multiple writers to a file, because it has internal file range locking. This change does not affect the NFSv4 code.
This improves random write performance to a single file in ZFS, since we can now have multiple writers inside ZFS at one time.
I have changed the default tuning parameters to a 22 bit (4MB) window size (from 256K) and unlimited commands per thread as a result of my benchmarking with ZFS.
The FHA code has been updated to allow configuring the tuning parameters from loader tunable variables in addition to sysctl variables. The read offset window calculation has been slightly modified as well. Instead of having separate bins, each file handle has a rolling window of bin_shift size. This minimizes glitches in throughput when shifting from one bin to another.
sys/conf/files: Add nfs_fha_new.c and nfs_fha_old.c. Compile nfs_fha.c when either the old or the new NFS server is built.
sys/fs/nfs/nfsport.h, sys/fs/nfs/nfs_commonport.c: Bring in changes from Rick Macklem to newnfs_realign that allow it to operate in blocking (M_WAITOK) or non-blocking (M_NOWAIT) mode.
sys/fs/nfs/nfs_commonsubs.c, sys/fs/nfs/nfs_var.h: Bring in a change from Rick Macklem to allow telling nfsm_dissect() whether or not to wait for mallocs.
sys/fs/nfs/nfsm_subs.h: Bring in changes from Rick Macklem to create a new nfsm_dissect_nonblock() inline function and NFSM_DISSECT_NONBLOCK() macro.
sys/fs/nfs/nfs_commonkrpc.c, sys/fs/nfsclient/nfs_clkrpc.c: Add the malloc wait flag to a newnfs_realign() call.
sys/fs/nfsserver/nfs_nfsdkrpc.c: Set up the new NFS server's RPC thread pool so that it will call the FHA code.
Add the malloc flag argument to newnfs_realign().
Unstaticize newnfs_nfsv3_procid[] so that we can use it in the FHA code.
sys/fs/nfsserver/nfs_nfsdsocket.c: In nfsrvd_dorpc(), add NFSPROC_WRITE to the list of RPC types that use the LK_SHARED lock type.
sys/fs/nfsserver/nfs_nfsdport.c: In nfsd_fhtovp(), if we're starting a write, check to see whether the underlying filesystem supports shared writes. If not, upgrade the lock type from LK_SHARED to LK_EXCLUSIVE.
sys/nfsserver/nfs_fha.c: Remove all code that is specific to the NFS server implementation. Anything that is server-specific is now accessed through a callback supplied by that server's FHA shim in the new softc.
There are now separate sysctls and tunables for the FHA implementations for the old and new NFS servers. The new NFS server has its tunables under vfs.nfsd.fha, the old NFS server's tunables are under vfs.nfsrv.fha as before.
In fha_extract_info(), use callouts for all server-specific code. Getting file handles and offsets is now done in the individual server's shim module.
In fha_hash_entry_choose_thread(), change the way we decide whether two reads are in proximity to each other. Previously, the calculation was a simple shift operation to see whether the offsets were in the same power of 2 bucket. The issue was that there would be a bucket (and therefore thread) transition, even if the reads were in close proximity. When there is a thread transition, reads wind up going somewhat out of order, and ZFS gets confused.
The new calculation simply tries to see whether the offsets are within 1 << bin_shift of each other. If they are, the reads will be sent to the same thread.
The effect of this change is that for sequential reads, if the client doesn't exceed the max_reqs_per_nfsd parameter and the bin_shift is set to a reasonable value (22, or 4MB works well in my tests), the reads in any sequential stream will largely be confined to a single thread.
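The difference between the old bucket test and the new window test described above can be sketched in a few lines of C. This is an illustration only, not the code from nfs_fha.c: the function names same_bucket() and within_window() are hypothetical, and reading "within 1 << bin_shift" as a strict less-than comparison is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Old scheme as described in the log: two offsets are "close" only if
 * they land in the same power-of-2 bucket.
 */
static bool
same_bucket(uint64_t a, uint64_t b, unsigned bin_shift)
{
	assert(bin_shift < 64);
	return ((a >> bin_shift) == (b >> bin_shift));
}

/*
 * New scheme as described in the log: two offsets are "close" if they
 * are less than 1 << bin_shift bytes apart (strictness is assumed).
 */
static bool
within_window(uint64_t a, uint64_t b, unsigned bin_shift)
{
	uint64_t dist;

	assert(bin_shift < 64);
	dist = (a > b) ? a - b : b - a;
	return (dist < ((uint64_t)1 << bin_shift));
}
```

With bin_shift = 22, the offsets 4MB - 64K and 4MB sit in different buckets under the old test even though they are only 64K apart; the window test treats them as neighbours, which is the bucket-transition glitch the log says the change removes.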
Change fha_assign() so that it takes a softc argument. It is now called from the individual server's shim code, which will pass in the softc.
Change fhe_stats_sysctl() so that it takes a softc parameter. It is now called from the individual server's shim code. Add the current offset to the list of things printed out about each active thread.
Change the num_reads and num_writes counters in the fha_hash_entry structure to 32-bit values, and rename them num_rw and num_exclusive, respectively, to reflect their changed usage.
Add an enable sysctl and tunable that allows the user to disable the FHA code (when vfs.XXX.fha.enable = 0). This is useful for before/after performance comparisons.
nfs_fha.h: Move most structure definitions out of nfs_fha.c and into the header file, so that the individual server shims can see them.
Change the default bin_shift to 22 (4MB) instead of 18 (256K). Allow unlimited commands per thread.
sys/nfsserver/nfs_fha_old.c, sys/nfsserver/nfs_fha_old.h, sys/fs/nfsserver/nfs_fha_new.c, sys/fs/nfsserver/nfs_fha_new.h: Add shims for the old and new NFS servers to interface with the FHA code, and callbacks for the
The shims contain all of the code and definitions that are specific to the NFS servers.
They set up the server-specific callbacks and set the server name for the sysctl and loader tunable variables.
sys/nfsserver/nfs_srvkrpc.c: Configure the RPC code to call fhaold_assign() instead of fha_assign().
sys/modules/nfsd/Makefile: Add nfs_fha.c and nfs_fha_new.c.
sys/modules/nfsserver/Makefile: Add nfs_fha_old.c.
Reviewed by: rmacklem Sponsored by: Spectra Logic MFC after: 2 weeks
|
249588 |
17-Apr-2013 |
gabor |
- Correct spelling in comments
Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)
|
249583 |
17-Apr-2013 |
gabor |
- Correct misspellings of the word necessary
Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)
|
249218 |
06-Apr-2013 |
jeff |
Prepare to replace the buf splay with a trie:
- Don't insert BKGRDMARKER bufs into the splay or dirty/clean buf lists. No consumers need to find them there and it complicates the tree. These flags are all FFS specific and could be moved out of the buf cache. - Use pbgetvp() and pbrelvp() to associate the background and journal bufs with the vp. Not only is this much cheaper it makes more sense for these transient bufs. - Fix the assertions in pbget* and pbrel*. It's not safe to check list pointers which were never initialized. Use the BX flags instead. We also check B_PAGING in reassignbuf() so this should cover all cases.
Discussed with: kib, mckusick, attilio Sponsored by: EMC / Isilon Storage Division
|
248967 |
01-Apr-2013 |
kib |
Strip the unneeded spaces, mostly at the end of lines.
MFC after: 3 days
|
248610 |
22-Mar-2013 |
pjd |
- Constify local path variable for chflagsat(). - Use correct format characters (%lx) for u_long.
This fixes the build broken in r248599.
|
248597 |
21-Mar-2013 |
pjd |
- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type u_long. Before this change it was of type int for syscalls, but prototypes in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not for lchflags(2)) stated that it was u_long. Now some related functions use u_long type for flags (strtofflags(3), fflagstostr(3)). - Make path argument of type 'const char *' for consistency.
Discussed on: arch Sponsored by: The FreeBSD Foundation
|
248581 |
21-Mar-2013 |
kib |
Initialize the variable to avoid (false) compiler warning about use of an uninitialized local.
Reported by: Ivan Klymenko <fidaj@ukr.net> MFC after: 2 weeks
|
248567 |
21-Mar-2013 |
kib |
Do not call vnode_pager_setsize() while an NFS node mutex is locked. vnode_pager_setsize() might sleep waiting for the page after EOF to be unbusied.
Call vnode_pager_setsize() both for the regular and directory vnodes.
Reported by: mich Reviewed by: rmacklem Discussed with: avg, jhb MFC after: 2 weeks
|
248500 |
19-Mar-2013 |
emaste |
Fix remainder calculation when biosize is not a power of 2
In common configurations biosize is a power of two, but is not required to be so. Thanks to markj@ for spotting an additional case beyond my original patch.
Reviewed by: rmacklem@
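The log does not show the faulty expression, so the following is only a sketch of one common way a remainder calculation goes wrong for sizes that are not a power of two: using a bit mask instead of the modulo operator. The function names rem_mask() and rem_mod() are hypothetical, and the assumption that the original bug had this shape is mine.

```c
#include <assert.h>
#include <stdint.h>

/* Mask form: equals off % biosize only when biosize is a power of two. */
static uint64_t
rem_mask(uint64_t off, uint64_t biosize)
{
	assert(biosize != 0);
	return (off & (biosize - 1));
}

/* Modulo form: correct for any nonzero biosize. */
static uint64_t
rem_mod(uint64_t off, uint64_t biosize)
{
	assert(biosize != 0);
	return (off % biosize);
}
```

For biosize = 32768 both forms agree. For biosize = 3000 and off = 10000 the true remainder is 1000, while the mask form yields a different value.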
|
248422 |
17-Mar-2013 |
kib |
Remove negative name cache entry pointing to the target name, which could be instantiated while tdvp was unlocked.
Reported by: Rick Miller <vmiller at hostileadmin com> Tested by: pho MFC after: 1 week
|
248282 |
14-Mar-2013 |
kib |
Add currently unused flag argument to the cluster_read(), cluster_write() and cluster_wbuild() functions. The flags to be allowed are a subset of the GB_* flags for getblk().
Sponsored by: The FreeBSD Foundation Tested by: pho
|
248255 |
13-Mar-2013 |
jhb |
Revert 195703 and 195821 as this special stop handling in NFS is now implemented via VFCF_SBDRY rather than passing PBDRY to individual sleep calls.
|
248188 |
12-Mar-2013 |
glebius |
Finish r243882: mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys.
Sponsored by: Nginx, Inc.
|
248101 |
09-Mar-2013 |
davide |
smbfs_lookup() in the DOTDOT case operates on dvp->n_parent without proper locking. This does not prevent reclaim of the vnode in any case. Avoid this by not going over-the-wire in this case and relying on a subsequent smbfs_getattr() call to restore consistency. While I'm here, change a couple of SMBVDEBUG() into MPASS(). smbfs_smb_lookup() doesn't and shouldn't know about '.' and '..'
Reported by: pho's stress2 suite
|
248099 |
09-Mar-2013 |
davide |
- Initialize variable in smbfs_rename() to silence compiler warning - Fix smbfs_mkdir() return value (in case of error).
Reported by: pho
|
248097 |
09-Mar-2013 |
attilio |
Garbage collect NWFS and NCP bits, which have been completely disconnected from the tree for a few months.
This patch is not targeted for MFC.
|
248084 |
09-Mar-2013 |
attilio |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
The change is mostly mechanical, but a few notes are in order:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
  - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (which forced consumers to include sys/mutex.h directly to cater for its inline functions using VM_OBJECT_LOCK()) means that all vm/vm_pager.h consumers must now also include sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between the FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
247665 |
02-Mar-2013 |
attilio |
Garbage collect NTFS bits, which have been completely disconnected from the tree for a few months.
This patch is not targeted for MFC.
|
247640 |
02-Mar-2013 |
attilio |
Garbage collect PORTALFS bits, which have been completely disconnected from the tree for a few months.
This patch is not targeted for MFC.
|
247635 |
02-Mar-2013 |
attilio |
Garbage collect CODAFS bits, which have been completely disconnected from the tree for a few months.
This patch is not targeted for MFC.
|
247628 |
02-Mar-2013 |
attilio |
Garbage collect HPFS bits, which have been completely disconnected from the tree for a few months (please note that the userland bits were disconnected a long time ago, thus there is no need to update the OLD* entries).
This is not targeted for MFC.
|
247619 |
02-Mar-2013 |
jilles |
nullfs: Improve f_flags in statfs().
Include some flags of the nullfs mount itself: MNT_RDONLY, MNT_NOEXEC, MNT_NOSUID, MNT_UNION, MNT_NOSYMFOLLOW.
This allows userland code calling statfs() or fstatfs() to see these flags. In particular, this allows opendir() to detect that a -t nullfs -o union mount needs deduplication (otherwise at least . and .. are returned twice) and allows rtld to detect a -t nullfs -o noexec mount as noexec.
Turn off the MNT_ROOTFS flag from the underlying filesystem because the nullfs mount is definitely not the root filesystem.
Reviewed by: kib MFC after: 1 week
|
247602 |
02-Mar-2013 |
pjd |
Merge Capsicum overhaul:
- A capability is no longer a separate descriptor type. Now every descriptor has its own set of capability rights.
- The cap_new(2) system call is left, but it is no longer documented and should not be used in new code.
- The new cap_rights_limit(2) syscall, which limits the capability rights of the given descriptor without creating a new one, should be used instead of cap_new(2).
- The cap_getrights(2) syscall is renamed to cap_rights_get(2).
- If the CAP_IOCTL capability right is present we can further reduce the list of allowed ioctls with the new cap_ioctls_limit(2) syscall. The list of allowed ioctls can be retrieved with the cap_ioctls_get(2) syscall.
- If the CAP_FCNTL capability right is present we can further reduce the fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrieve them with cap_fcntls_get(2).
- To support ioctl and fcntl white-listing the filedesc structure was heavily modified.
- The audit subsystem, kdump and procstat tools were updated to recognize new syscalls.
- Capability rights were revised and even though I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below:
CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT.
Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2).
Added CAP_SYMLINKAT: - Allow for symlinkat(2).
Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2).
Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory.
Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall.
Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. - Required if target of renameat(2) exists and will be removed by this call.
Removed CAP_MAPEXEC.
CAP_MMAP old behaviour: - Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE. CAP_MMAP new behaviour: - Allow for mmap(2)+PROT_NONE.
Added CAP_MMAP_R: - Allow for mmap(PROT_READ). Added CAP_MMAP_W: - Allow for mmap(PROT_WRITE). Added CAP_MMAP_X: - Allow for mmap(PROT_EXEC). Added CAP_MMAP_RW: - Allow for mmap(PROT_READ | PROT_WRITE). Added CAP_MMAP_RX: - Allow for mmap(PROT_READ | PROT_EXEC). Added CAP_MMAP_WX: - Allow for mmap(PROT_WRITE | PROT_EXEC). Added CAP_MMAP_RWX: - Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).
Renamed CAP_MKDIR to CAP_MKDIRAT. Renamed CAP_MKFIFO to CAP_MKFIFOAT. Renamed CAP_MKNOD to CAP_MKNODAT.
CAP_READ old behaviour: - Allow pread(2). - Disallow read(2), readv(2) (if there is no CAP_SEEK). CAP_READ new behaviour: - Allow read(2), readv(2). - Disallow pread(2) (CAP_SEEK was also required).
CAP_WRITE old behaviour: - Allow pwrite(2). - Disallow write(2), writev(2) (if there is no CAP_SEEK). CAP_WRITE new behaviour: - Allow write(2), writev(2). - Disallow pwrite(2) (CAP_SEEK was also required).
Added convenient defines:
#define CAP_PREAD (CAP_SEEK | CAP_READ)
#define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
#define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
#define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
#define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
#define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
#define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
#define CAP_RECV CAP_READ
#define CAP_SEND CAP_WRITE
#define CAP_SOCK_CLIENT \
	(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
	 CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
#define CAP_SOCK_SERVER \
	(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
	 CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
	 CAP_SETSOCKOPT | CAP_SHUTDOWN)
Added defines for backward API compatibility:
#define CAP_MAPEXEC CAP_MMAP_X
#define CAP_DELETE CAP_UNLINKAT
#define CAP_MKDIR CAP_MKDIRAT
#define CAP_RMDIR CAP_UNLINKAT
#define CAP_MKFIFO CAP_MKFIFOAT
#define CAP_MKNOD CAP_MKNODAT
#define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)
Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib
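The composite rights in this entry are plain bit masks, so a descriptor satisfies CAP_PREAD only if it holds every bit in the mask. The following is a minimal sketch of that subset check, not kernel code: the EX_ names, the bit values, and the helper ex_rights_allow() are all hypothetical placeholders and are not copied from the FreeBSD headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only. */
#define EX_CAP_READ	0x01ULL
#define EX_CAP_WRITE	0x02ULL
#define EX_CAP_SEEK	0x04ULL

/* Composite rights, mirroring the shape of CAP_PREAD and CAP_PWRITE. */
#define EX_CAP_PREAD	(EX_CAP_SEEK | EX_CAP_READ)
#define EX_CAP_PWRITE	(EX_CAP_SEEK | EX_CAP_WRITE)

/* True if every bit in 'need' is also present in 'have'. */
static bool
ex_rights_allow(uint64_t have, uint64_t need)
{
	assert(need != 0);	/* an empty request is a caller bug here */
	return ((have & need) == need);
}
```

Under this model a descriptor limited to the read right alone passes a read(2)-style check but fails a pread(2)-style check, which matches the new CAP_READ behaviour listed in the entry.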
|
247312 |
26-Feb-2013 |
alc |
Eliminate a duplicate #include.
Sponsored by: EMC / Isilon Storage Division
|
247297 |
26-Feb-2013 |
attilio |
Merge from vmobj-rwlock branch: Remove unused inclusion of vm/vm_pager.h and vm/vnode_pager.h.
Sponsored by: EMC / Isilon storage division Tested by: pho Reviewed by: alc
|
247116 |
21-Feb-2013 |
jhb |
Further refine the handling of stop signals in the NFS client. The changes in r246417 were incomplete as they did not add explicit calls to sigdeferstop() around all the places that previously passed SBDRY to _sleep(). In addition, nfs_getcacheblk() could trigger a write RPC from getblk() resulting in sigdeferstop() recursing. Rather than manually deferring stop signals in specific places, change the VFS_*() and VOP_*() methods to defer stop signals for filesystems which request this behavior via a new VFCF_SBDRY flag. Note that this has to be a VFC flag rather than a MNTK flag so that it works properly with VFS_MOUNT() when the mount is not yet fully constructed. For now, only the NFS clients set this new flag in VFS_SET().
A few other related changes: - Add an assertion to ensure that TDF_SBDRY doesn't leak to userland. - When a lookup request uses VOP_READLINK() to follow a symlink, mark the request as being on behalf of the thread performing the lookup (cnp_thread) rather than using a NULL thread pointer. This causes NFS to properly handle signals during this VOP on an interruptible mount.
PR: kern/176179 Reported by: Russell Cattelan (sigdeferstop() recursion) Reviewed by: kib MFC after: 1 month
|
247072 |
21-Feb-2013 |
imp |
The request queue is already locked, so we don't need the splsoftclock/splx here to note future work.
|
246921 |
17-Feb-2013 |
kib |
Do not update the fsinfo block on each update of any fat block, this is excessive. Postpone the flush of the fsinfo to VFS_SYNC(), remembering the need for update with the flag MSDOSFS_FSIMOD, stored in pm_flags.
FAT32 specification describes both FSI_Free_Count and FSI_Nxt_Free as the advisory hints, not requiring them to be correct.
Based on the patch from bde, modified by me.
Reviewed by: bde MFC after: 2 weeks
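The deferred-flush idea in this entry (mark fsinfo as stale on each FAT update, write it once at sync time) can be sketched as a dirty-flag pattern. This is an illustrative model only: struct ex_mount, EX_FSIMOD, ex_fat_update() and ex_sync() are hypothetical stand-ins for the msdosfs mount structure, MSDOSFS_FSIMOD, the FAT update path and VFS_SYNC(), and the "disk" is just a field in the struct.

```c
#include <assert.h>
#include <stddef.h>

#define EX_FSIMOD	0x01u	/* stands in for MSDOSFS_FSIMOD */

struct ex_mount {
	unsigned flags;		/* stands in for pm_flags */
	long free_count;	/* in-memory free cluster hint */
	long disk_free_count;	/* simulated on-disk fsinfo copy */
	unsigned fsinfo_writes;	/* number of simulated fsinfo writes */
};

/* Called on every FAT update: only remember that fsinfo is stale. */
static void
ex_fat_update(struct ex_mount *m, long delta)
{
	assert(m != NULL);
	m->free_count += delta;
	m->flags |= EX_FSIMOD;
}

/* Called from the sync path: write fsinfo once, and only if stale. */
static void
ex_sync(struct ex_mount *m)
{
	assert(m != NULL);
	if ((m->flags & EX_FSIMOD) == 0)
		return;
	m->disk_free_count = m->free_count;
	m->fsinfo_writes++;
	m->flags &= ~EX_FSIMOD;
}
```

Any number of updates between two syncs costs a single fsinfo write. The on-disk hint is briefly out of date in between, which the entry argues is acceptable because the FAT32 fields are advisory.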
|
246793 |
14-Feb-2013 |
bapt |
Revert r246791 as it needs a security review first
Reported by: gavin, rwatson
|
246791 |
14-Feb-2013 |
bapt |
Allow fdescfs to be mounted from inside a jail
MFC after: 1 week
|
246634 |
10-Feb-2013 |
pfg |
ext2fs: Use prototype declarations for function definitions
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246603 |
10-Feb-2013 |
attilio |
Remove racy checks on resident and cached pages for tmpfs_mapped{read, write}() functions: - tmpfs_mapped{read, write}() are only called within VOP_{READ, WRITE}(), which check beforehand to work only on valid VREG vnodes. Also the vnode is locked for the duration of the work, making vnode reclaiming impossible during the operation. Hence, vobj can never be NULL. - The current check on resident pages and cached pages without the vm object lock held is racy and can do even more harm than good, as a page could be transitioning between these 2 pools and then be skipped entirely. Skip the checks as lookups on empty splay trees are very cheap.
Discussed with: alc Tested by: flo MFC after: 2 weeks
|
246564 |
08-Feb-2013 |
pfg |
ext2fs: Replace redundant EXT2_MIN_BLOCK with EXT2_MIN_BLOCK_SIZE.
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246563 |
08-Feb-2013 |
pfg |
ext2fs: make e2fs_maxcontig local and remove tautological check.
e2fs_maxcontig was modelled after UFS when bringing the "Orlov allocator" to ext2. On UFS fs_maxcontig is kept in the superblock and is used by userland tools (fsck and growfs).
In ext2 this information is volatile so it is not available for userland tools, so in this case it doesn't make sense to carry it in the in-memory superblock.
Also remove a pointless check for MAX(1, x) > 0.
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246562 |
08-Feb-2013 |
pfg |
Remove unused MAXSYMLINKLEN macro.
Reviewed by: mckusick PR: kern/175794 MFC after: 1 week
|
246472 |
07-Feb-2013 |
kib |
Stop translating the ERESTART error from the open(2) into EINTR. Posix requires that open(2) is restartable for SA_RESTART.
For non-posix objects, in particular, devfs nodes, still disable automatic restart of the opens. The open call to a driver could have significant side effects for the hardware.
Noted and reviewed by: jilles Discussed with: bde MFC after: 2 weeks
|
246417 |
06-Feb-2013 |
jhb |
Rework the handling of stop signals in the NFS client. The changes in 195702, 195703, and 195821 prevented a thread from suspending while holding locks inside of NFS by forcing the thread to fail sleeps with EINTR or ERESTART but defer the thread suspension to the user boundary. However, this had the effect that stopping a process during an NFS request could abort the request and trigger EINTR errors that were visible to userland processes (previously the thread would have suspended and completed the request once it was resumed).
This change instead effectively masks stop signals while in the NFS client. It uses the existing TDF_SBDRY flag to effect this since SIGSTOP cannot be masked directly. Also, instead of setting PBDRY on individual sleeps, the NFS client now sets the TDF_SBDRY flag around each NFS request and stop signals are masked for all sleeps during that region (the previous change missed sleeps in lockmgr locks). The end result is that stop signals sent to threads performing an NFS request are completely ignored until after the NFS request has finished processing and the thread prepares to return to userland. This restores the behavior of stop signals being transparent to userland processes while still preventing threads from suspending while holding NFS locks.
Reviewed by: kib MFC after: 1 month
|
246352 |
05-Feb-2013 |
pfg |
ext2fs: move assignment where it is not dead.
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246351 |
05-Feb-2013 |
pfg |
ext2fs: Remove unused em_e2fsb definition.
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246350 |
05-Feb-2013 |
pfg |
ext2fs: Remove useless rootino local variable.
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246349 |
05-Feb-2013 |
pfg |
ext2fs: Correct off-by-one errors in FFTODT() and DDTOFT().
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246348 |
05-Feb-2013 |
pfg |
ext2fs: Use nitems().
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246347 |
05-Feb-2013 |
pfg |
ext2fs: Use EXT2_LINK_MAX instead of LINK_MAX
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246258 |
02-Feb-2013 |
pfg |
ext2fs: general cleanup.
- Remove unused extern declarations in fs.h - Correct comments in ext2_dir.h - Several panic() messages showed wrong function names. - Remove commented out stray line in ext2_alloc.c. - Remove the unused macro EXT2_BLOCK_SIZE_BITS() and the then write-only member e2fs_blocksize_bits from struct m_ext2fs. - Remove the unused macro EXT2_FIRST_INO() and the then write-only member e2fs_first_inode from struct m_ext2fs. - Remove EXT2_DESC_PER_BLOCK() and the member e2fs_descpb from struct m_ext2fs. - Remove the unused members e2fs_bmask, e2fs_dbpg and e2fs_mount_opt from struct m_ext2fs - Correct harmless off-by-one error for fspath in ext2_vfsops.c. - Remove the unused and broken macros EXT2_ADDR_PER_BLOCK_BITS() and EXT2_DESC_PER_BLOCK_BITS(). - Remove the !_KERNEL versions of the EXT2_* macros.
Submitted by: Christoph Mallon MFC after: 2 weeks
|
246219 |
01-Feb-2013 |
kib |
The MSDOSFSMNT_WAITONFAT flag is bogus and broken. It does less than track the MNT_SYNCHRONOUS flag. It is set to the latter at mount time but not updated by MNT_UPDATE.
Use MNT_SYNCHRONOUS to decide to write the FAT updates synchronously.
Submitted by: bde MFC after: 1 week
|
246218 |
01-Feb-2013 |
kib |
Backup FATs were sometimes marked dirty by copying their first block from the primary FAT, and then they were not marked clean on unmount. Force marking them clean when appropriate.
Submitted by: bde MFC after: 1 week
|
246217 |
01-Feb-2013 |
kib |
The directory entry for dotdot was corrupted in the FAT32 case when moving a directory to a subdir of the root directory from somewhere else.
For all directory moves that change the parent directory, the dotdot entry must be fixed up. For msdosfs, the root directory is magic for non-FAT32. It is less magic for FAT32, but needs the same magic for the dotdot fixup. It didn't have it.
Both chkdsk and fsck_msdosfs fix the corrupt directory entries with no problems.
The fix is to use the same magic for dotdot in msdosfs_rename() as in msdosfs_mkdir().
For msdosfs_mkdir(), document the magic. When writing the dotdot entry in mkdir, use the explicitly set pcl variable instead of relying on the start cluster of the root directory typically having a value < 65536.
Submitted by: bde MFC after: 1 week
|
246216 |
01-Feb-2013 |
kib |
The mountmsdosfs() function had an insane sanity test, remove it.
Trying FAT32 on a small partition failed to mount because pmp->pm_Sectors was nonzero. Normally, FAT32 file systems are so large that the 16-bit pm_Sectors can't hold the size. This is indicated by setting it to 0 and using only pm_HugeSectors. But at least old versions of newfs_msdos use the 16-bit field if possible, and msdosfs supports this except for breaking its own support in the sanity check. This is quite different from the handling of pm_FATsecs -- now the 16-bit value is always ignored for FAT32 except for checking that it is 0, and newfs_msdos doesn't use the 16-bit value for FAT32.
Submitted by: bde MFC after: 1 week
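The two-field size convention described in this entry can be sketched as follows. This is a simplified model, not the mountmsdosfs() code: ex_total_sectors() is a hypothetical helper, its parameters stand in for pm_Sectors and pm_HugeSectors, and the rule "use the 16-bit field when it is nonzero, otherwise the 32-bit one" is my reading of the description.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick the volume size in sectors. A nonzero 16-bit count is accepted
 * as-is (at least old versions of newfs_msdos write it when the size
 * fits); zero means "look at the 32-bit field instead".
 */
static uint32_t
ex_total_sectors(uint16_t sectors16, uint32_t huge_sectors)
{
	/* In this sketch a volume must report its size somewhere. */
	assert(sectors16 != 0 || huge_sectors != 0);
	return (sectors16 != 0 ? sectors16 : huge_sectors);
}
```

A check that rejects any FAT32 volume with a nonzero 16-bit count would refuse the small-partition case that this helper handles, which is the failure the entry describes.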
|
246215 |
01-Feb-2013 |
kib |
Fix a backwards comment in markvoldirty().
Submitted by: bde MFC after: 1 week
|
246213 |
01-Feb-2013 |
kib |
Assert that the mbuf in the chain has sane length. The proper place for this check is somewhere in the network code, but this assertion has already proven to be useful in catching what seem to be driver bugs causing NFS to scramble random memory.
Discussed with: rmacklem MFC after: 1 week
|
245977 |
27-Jan-2013 |
kib |
Be conservative and do not try to consume more bytes than were requested from the server for the read operation. The server shall not reply with too large a size, but the client should be resilient too.
Reviewed by: rmacklem MFC after: 1 week
|
245952 |
26-Jan-2013 |
pfg |
Clean some 'svn:executable' properties in the tree.
Submitted by: Christoph Mallon MFC after: 3 days
|
245950 |
26-Jan-2013 |
pfg |
Cosmetical off-by-one
Technically, the case when all the blocks are released is not a sanity check. Move further the comment while here.
Suggested by: bde MFC after: 3 days
|
245909 |
25-Jan-2013 |
jhb |
Further cleanups to use of timestamps in NFS: - Use NFSD_MONOSEC (which maps to time_uptime) instead of the seconds portion of wall-time stamps to manage timeouts on events. - Remove unused nd_starttime from the per-request structure in the new NFS server. - Use nanotime() for the modification time on a delegation to get as precise a time as possible. - Use time_second instead of extracting the second from a call to getmicrotime().
Submitted by: bde (3) Reviewed by: bde, rmacklem MFC after: 2 weeks
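The first item above (uptime-based seconds for timeouts rather than wall-clock seconds) has a direct userland analogue, sketched here with the POSIX monotonic clock. This is not the NFS code: ex_uptime_sec() and ex_expired() are hypothetical helpers, and using CLOCK_MONOTONIC as a stand-in for time_uptime is an assumption made for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Seconds on a clock that does not jump when the wall time is changed. */
static time_t
ex_uptime_sec(void)
{
	struct timespec ts;
	int rc;

	rc = clock_gettime(CLOCK_MONOTONIC, &ts);
	assert(rc == 0);
	(void)rc;
	return (ts.tv_sec);
}

/* A timeout has fired once 'now' reaches the recorded deadline. */
static bool
ex_expired(time_t now, time_t deadline)
{
	return (now >= deadline);
}
```

A deadline computed as ex_uptime_sec() + 30 stays 30 seconds away even if the system date is stepped in the meantime; the same deadline computed from wall-clock seconds could fire early or late.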
|
245844 |
23-Jan-2013 |
pfg |
ext2fs: fix a check for negative block numbers.
The previous change accidentally left the subtraction we were trying to avoid in case that i_blocks could become negative.
Reported by: bde MFC after: 4 days
|
245820 |
22-Jan-2013 |
pfg |
ext2fs: make some inode fields match the ext2 spec.
Ext2fs uses unsigned fields in its dinode struct. FreeBSD can have negative values in some of those fields and the inode is meant to interact with the system so we have never respected the unsigned nature of most of those fields.
Block numbers and the NFS generation number do not need to be signed so redefine them as unsigned to better match the on-disk information.
MFC after: 1 week
|
245817 |
22-Jan-2013 |
pfg |
ext2fs: temporarily disable the reallocation code.
Testing with fsx has revealed problems and in order to hunt the bugs properly we need to reduce the complexity.
This seems to help but is not a complete solution.
MFC after: 3 days
|
245613 |
18-Jan-2013 |
delphij |
Make it possible to force async at server side on new NFS server, similar to the old one's nfs.nfsrv.async.
Please note that by enabling this option (default is disabled), the system could potentially have silent data corruption if the server crashes before a write is committed to non-volatile storage, as the client side has no way to tell if the data is already written.
Submitted by: rmacklem MFC after: 2 weeks
|
245612 |
18-Jan-2013 |
pfg |
ext2fs: Add some DOINGASYNC check to match ffs.
This is mostly cosmetical.
Reviewed by: bde MFC after: 3 days
|
245611 |
18-Jan-2013 |
jhb |
Use vfs_timestamp() to set file timestamps rather than invoking getmicrotime() or getnanotime() directly in NFS.
Reviewed by: rmacklem, bde MFC after: 1 week
|
245566 |
17-Jan-2013 |
jhb |
Remove a no-longer-used variable after the previous change to use VA_UTIMES_NULL.
Submitted by: bde, rmacklem MFC after: 1 week
|
245508 |
16-Jan-2013 |
jhb |
Use the VA_UTIMES_NULL flag to detect when NULL was passed to utimes() instead of comparing the desired time against the current time as a heuristic.
Reviewed by: rmacklem MFC after: 1 week
|
245495 |
16-Jan-2013 |
kib |
Remove the filtering of the acceptable mount options for nullfs, added in r245004. Although the report was for noatime option which is non-functional for the nullfs, other standard options like nosuid or noexec are useful with it.
Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au> MFC after: 3 days
|
245476 |
15-Jan-2013 |
jhb |
- More properly handle interrupted NFS requests on an interruptible mount by returning an error of EINTR rather than EACCES. - While here, bring back some (but not all) of the NFS RPC statistics lost when krpc was committed.
Reviewed by: rmacklem MFC after: 1 week
|
245408 |
14-Jan-2013 |
kib |
The current default size of the nullfs hash table used to look up the existing nullfs vnode by the lower vnode is only 16 slots. Since the default mode for nullfs is to cache the vnodes, the hash has extremely long chains.
Size the nullfs hashtbl based on the current value of desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a given vnode.
Pointy hat to: kib Diagnosed and reviewed by: peter Tested by: peter, pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 5 days
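To make the cost of a 16-slot table concrete, here is a minimal arithmetic sketch, not the nullfs code. The helper names `avg_chain_len()` and `bucket_index()` are hypothetical, the figure of 100000 cached vnodes is an assumed workload, and the power-of-two mask is only one common way to pick a bucket; the commit itself relies on vfs_hash_index().

```c
#include <assert.h>
#include <stddef.h>

/* Average number of entries per bucket, assuming an evenly spread hash. */
size_t
avg_chain_len(size_t nentries, size_t nbuckets)
{
	assert(nbuckets > 0);
	return (nentries / nbuckets);
}

/* Bucket index for a table whose size is a power of two (an assumption). */
size_t
bucket_index(size_t hash, size_t nbuckets)
{
	assert(nbuckets > 0 && (nbuckets & (nbuckets - 1)) == 0);
	return (hash & (nbuckets - 1));
}
```

With 100000 entries, 16 buckets give chains of about 6250 nodes each, while a table scaled to the vnode count (say 65536 buckets) keeps the average near 1.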
|
245262 |
10-Jan-2013 |
kib |
When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed, get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp.
Reported by: flo Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
245164 |
08-Jan-2013 |
bapt |
Add support for the IO_APPEND flag in fuse. This makes open(..., O_APPEND) actually work on fuse filesystems.
Reviewed by: attilio
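The userland-visible semantics this change enables can be checked with a small sketch like the one below. It is an illustration under assumptions: `append_bytes()` is a hypothetical helper, and it exercises plain POSIX O_APPEND behaviour on whatever filesystem holds `path`, not the fuse kernel code itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Append `len` bytes to `path` and return the file offset after the write,
 * or -1 on error.  With O_APPEND every write lands at the current end of
 * file, even if the descriptor was positioned elsewhere beforehand.
 */
off_t
append_bytes(const char *path, const void *buf, size_t len)
{
	off_t end;
	int fd;

	assert(path != NULL && buf != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
	if (fd < 0)
		return (-1);
	/* Deliberately seek to the start; O_APPEND must override this. */
	if (lseek(fd, 0, SEEK_SET) < 0 ||
	    write(fd, buf, len) != (ssize_t)len) {
		close(fd);
		return (-1);
	}
	end = lseek(fd, 0, SEEK_CUR);
	close(fd);
	return (end);
}
```

Calling the helper twice on a fresh file should yield growing offsets; a filesystem that ignores the append flag would instead overwrite the beginning of the file.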
|
245121 |
07-Jan-2013 |
pfg |
ext2fs: clean up the dinode structure.
It was plagued with style errors and the offsets had been lost. While here, take the time to update the fields according to the latest ext4 documentation.
Reviewed by: bde MFC after: 3 days
|
245115 |
06-Jan-2013 |
gleb |
tmpfs: Replace directory entry linked list with RB-Tree.
Use file name hash as a tree key, handle duplicate keys. Both VOP_LOOKUP and VOP_READDIR operations utilize same tree for search. Directory entry offset (cookie) is either file name hash or incremental id in case of hash collisions (duplicate-cookies). Keep sorted per directory list of duplicate-cookie entries to facilitate cookie number allocation.
Don't fail if the previous VOP_READDIR() offset is no longer valid; start with the next dirent instead. Other file systems handle it similarly.
Work around the race-prone update of the tn_readdir_last[pn] fields.
Add tmpfs_dir_destroy() to free all dirents.
Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from tmpfs_dir_getdents() instead of hard coded -1.
Mark directory traversal routines static as they are no longer used outside of tmpfs_subr.c
|
245033 |
04-Jan-2013 |
kib |
Fix reversed condition in the assertion.
Pointy hat to: kib MFC after: 13 days
|
245004 |
03-Jan-2013 |
kib |
Add the "nocache" nullfs mount option, which disables the caching of the free nullfs vnodes, switching nullfs behaviour to pre-r240285. The option is mostly intended as the last-resort when higher pressure on the vnode cache due to doubling of the vnode counts is not desirable.
Note that disabling the cache costs more than 2x wall time in metadata-hungry scenarios. The default is "cache".
Tested and benchmarked by: pho (previous version) MFC after: 2 weeks
|
245000 |
03-Jan-2013 |
kib |
Remove the last use of the deprecated MNT_VNODE_FOREACH interface in the tree.
With the help from: mjg Tested by: Ronald Klop <ronald-freebsd8@klop.yi.org> MFC after: 2 weeks
|
244643 |
23-Dec-2012 |
kib |
Do not force a writer to the devfs file to drain the buffer writes.
Requested and tested by: Ian Lepore <freebsd@damnhippie.dyndns.org> MFC after: 2 weeks
|
244475 |
20-Dec-2012 |
pfg |
More constant renaming in preparation for newer features.
We also try to make better use of the fs flags instead of trying to adapt the code according to the fs structures. In the case of subsecond timestamps and birthtime we now check that the feature is explicitly enabled: previously we only checked that the reserved space was available and silently wrote them.
This approach is much safer, especially if the filesystem happens to use embedded inodes or support EAs.
Discussed with: Zheng Liu MFC after: 5 days
|
244056 |
09-Dec-2012 |
rmacklem |
Add "nfsstat -m" support for the two new NFS mount options added by r244042.
|
244042 |
08-Dec-2012 |
rmacklem |
Move the NFSv4.1 client patches over from projects/nfsv4.1-client to head. I don't think the NFS client behaviour will change unless the new "minorversion=1" mount option is used. It includes basic NFSv4.1 support plus support for pNFS using the Files Layout only. All problems detected during an NFSv4.1 Bakeathon testing event in June 2012 have been resolved in this code and it has been tested against the NFSv4.1 server available to me. Although not reviewed, I believe that kib@ has looked at it.
|
243882 |
05-Dec-2012 |
glebius |
Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys.
Exceptions:
- sys/contrib not touched - sys/mbuf.h edited manually
|
243782 |
02-Dec-2012 |
rmacklem |
Add an nfssvc() option to the kernel for the new NFS client which dumps out the actual options being used by an NFS mount. This will be used to implement a "-m" option for nfsstat(1).
Reviewed by: alfred MFC after: 2 weeks
|
243652 |
28-Nov-2012 |
pfg |
Update some definitions or make them match NetBSD's headers.
Bring several definitions required for newer ext4 features.
Rename EXT2F_COMPAT_HTREE to EXT2F_COMPAT_DIRHASHINDEX since it is not being used yet and the new name is more compatible with NetBSD and Linux.
This change is purely cosmetic and has no effect on the real code.
Obtained from: NetBSD MFC after: 3 days
|
243641 |
28-Nov-2012 |
pfg |
Partially bring r242520 to ext2fs.
When a file is first being written, the dynamic block reallocation (implemented by ext2_reallocblks) relocates the file's blocks so as to cluster them together into a contiguous set of blocks on the disk.
When the cluster crosses the boundary into the first indirect block, the first indirect block is initially allocated in a position immediately following the last direct block. Block reallocation would usually destroy locality by moving the indirect block out of the way to keep the data blocks contiguous.
The issue was diagnosed long ago by Bruce Evans on ffs and surfaced on ext2fs when block reallocation was ported. This is only a partial solution based on the similarities with FFS. We still require more review of the allocation details that vary in ext2fs.
Reported by: bde MFC after: 1 week
|
243548 |
26-Nov-2012 |
davide |
- smbfs_rename() might return an error value without correctly upgrading the vnode use count, and this might cause the kernel to panic if compiled with WITNESS enabled. - Be sure to put the '\0' terminator to the rpath string.
Sponsored by: iXsystems inc.
|
243397 |
22-Nov-2012 |
davide |
- Remove reset of vpp pointer in some places as long as it's not really useful and has the side effect of obfuscating the code a bit. - Remove spurious references to simple_lock.
Reported by: attilio [1] Sponsored by: iXsystems inc.
|
243396 |
22-Nov-2012 |
davide |
Until now, smbfs_fullpath() computed the full path starting from the vnode and following back the chain of n_parent pointers up to the root, without acquiring the locks of the n_parent vnodes analyzed during the computation. This is immediately wrong because if the vnode lock is not held there's no guarantee on the validity of the vnode pointer or the data. In order to fix, store the whole path in the smbnode structure so that smbfs_fullpath() can use this information.
Discussed with: kib Reported and tested by: pho Sponsored by: iXsystems inc.
|
243340 |
20-Nov-2012 |
kib |
Remove the check and panic for an impossible condition. The NULL lowervp vnode v_vnlock would cause panic due to NULL pointer dereference much earlier.
MFC after: 1 week
|
243311 |
19-Nov-2012 |
attilio |
r16312 has not been accurate for many years (likely since VFS received granular locking), but the comment present in UFS has been copied incorrectly into other filesystems' code several times.
Remove comments that make no sense now.
Reviewed by: kib MFC after: 3 days
|
243142 |
16-Nov-2012 |
kib |
In pget(9), if PGET_NOTWEXIT flag is not specified, also search the zombie list for the pid. This allows several kern.proc sysctls to report useful information for zombies.
Hold the allproc_lock around all searches instead of relocking it. Remove private pfind_locked() from the new nfs client code.
Requested and reviewed by: pjd Tested by: pho MFC after: 3 weeks
|
243039 |
14-Nov-2012 |
kib |
Remove M_USE_RESERVE from the devfs cdp allocator, which is one of two uses of M_USE_RESERVE in the kernel. This allocation is not special.
Reviewed by: alc Tested by: pho MFC after: 2 weeks
|
243038 |
14-Nov-2012 |
davide |
Get rid of some old debug code. It provides checks similar to the one offered by RedZone so there's no need to keep it.
Sponsored by: iXsystems inc.
|
243033 |
14-Nov-2012 |
davide |
Fix the lookup in the DOTDOT case in the same way as other filesystems do, i.e. inlining the vn_vget_ino() algorithm.
Sponsored by: iXsystems inc.
|
242875 |
10-Nov-2012 |
attilio |
- Protect mnt_data and mnt_flags under the mount interlock - Move mp->mnt_stat manipulation where all of them happen
Reported by: davide Discussed with: kib Tested by: flo MFC after: 2 months X-MFC: 241519, 242536,242616, 242727
|
242833 |
09-Nov-2012 |
attilio |
Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag. Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened at the same timeframe.
|
242727 |
08-Nov-2012 |
attilio |
- Current caching mode is completely broken because it simply relies on timing of the operations and not real lookup, bringing too many false positives. Remove the whole mechanism. If it needs to be implemented, next time it should really be done in the proper way. - Fix VOP_GETATTR() in order to cope with userland bugs that would change the type of file and not panic. Instead it gets the entry as if it is not existing.
Reported and tested by: flo MFC after: 2 months X-MFC: 241519, 242536,242616
|
242616 |
05-Nov-2012 |
attilio |
fuse_io* must be able to crunch also VDIR vnodes. Update assert appropriately.
Reported and Tested by: flo MFC after: 2 months X-MFC: 241519,242536
|
242536 |
03-Nov-2012 |
attilio |
Fix a bug where operations were carried out even if not implemented, leading to handling of an invalid fdip object.
Reported and tested by: flo MFC after: 2 months X-MFC: 241519
|
242476 |
02-Nov-2012 |
kib |
r241025 fixed the case where a binary executed from a nullfs mount could still be opened for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay.
Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks.
Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph.
Tested by: pho MFC after: 3 weeks
|
242387 |
31-Oct-2012 |
davide |
- Do not put in the mntqueue half-constructed vnodes. - Change the code so that it relies on vfs_hash rather than on a home-made hashtable. - There's no need to inline fnv_32_buf().
Reviewed by: delphij Tested by: pho Sponsored by: iXsystems inc.
|
242386 |
31-Oct-2012 |
davide |
Fix panic due to page faults while in kernel mode, under conditions of VM pressure. The reason is that in some codepaths pointers to stack variables were passed from one thread to another.
In collaboration with: pho Reported by: pho's stress2 suite Sponsored by: iXsystems inc.
|
242384 |
31-Oct-2012 |
davide |
Change the code to use %jd as printf() placeholder for uio_offset and cast to intmax_t.
Suggested by: pjd Sponsored by: iXsystems inc.
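The idiom named in this commit is portable C99: an off_t-sized value such as uio_offset has no printf() length modifier of its own, so it is cast to intmax_t and printed with %jd. A minimal userland sketch, where `format_offset()` is a hypothetical helper rather than code from the tree:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a file offset portably.  off_t may be 32 or 64 bits wide, so it is
 * widened to intmax_t, which is exactly what the %jd conversion expects.
 */
int
format_offset(char *dst, size_t dstlen, off_t offset)
{
	assert(dst != NULL);
	return (snprintf(dst, dstlen, "offset=%jd", (intmax_t)offset));
}
```

Using %d or %ld here would be wrong on platforms where off_t is wider than int or long; the cast keeps the format string and the argument type in agreement everywhere.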
|
242097 |
25-Oct-2012 |
davide |
Fix build in case we have SMBVDEBUG turned on.
Reviewed by: gnn Approved by: gnn Sponsored by: iXsystems inc.
|
242092 |
25-Oct-2012 |
davide |
- Remove the references to the deprecated zalloc kernel interface - Use M_ZERO flag in malloc() rather than bzero() - malloc() with M_WAITOK can't return NULL so there's no need to check
Reviewed by: alc Approved by: alc
|
241896 |
22-Oct-2012 |
kib |
Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes.
Conducted and reviewed by: attilio Tested by: pho
|
241844 |
22-Oct-2012 |
eadler |
remove duplicate semicolons where possible.
Approved by: cperciva MFC after: 1 week
|
241702 |
18-Oct-2012 |
ed |
Remove unneeded D_NEEDMINOR.
This is only needed when using clonelists. This got removed in r238693.
|
241561 |
14-Oct-2012 |
rmacklem |
Add two new options to the nfssvc(2) syscall that allow processes running as root to suspend/resume execution of the kernel nfsd threads. An earlier version of this patch was tested by Vincent Hoffman (vince at unsane.co.uk) and John Hickey (jh at deterlab.net).
Reviewed by: kib MFC after: 2 weeks
|
241554 |
14-Oct-2012 |
kib |
Grammar fixes.
Submitted by: bf MFC after: 1 week
|
241548 |
14-Oct-2012 |
kib |
Replace the XXX comment with the proper description.
MFC after: 1 week
|
241521 |
14-Oct-2012 |
attilio |
Rename s/DEBUG()/FS_DEBUG() and s/DEBUG2G()/FS_DEBUG2G() in order to avoid a name clash in sparc64.
MFC after: 2 months X-MFC: r241519
|
241519 |
14-Oct-2012 |
attilio |
Import a FreeBSD port of the FUSE Linux module. This has been developed during 2 Summer of Code mandates and has been revived by gnn recently. The functionality in this commit entirely mirrors the content of the fusefs-kmod port, which doesn't need to be installed anymore for -CURRENT setups.
In order to get some sparse technical notes, please refer to: http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html
or to the project branch: svn://svn.freebsd.org/base/projects/fuse/
which also contains a granular history of the changes that happened during port refinements. This commit does not come from the branch reintegration itself because it seems svn is not behaving properly for this functionality at the moment.
Partly Sponsored by: Google, Summer of Code program 2005, 2011 Originally submitted by: ilya, Csaba Henk <csaba-ml AT creo DOT hu > In collaboration with: pho Tested by: flo, gnn, Gustau Perez, Kevin Oberman <rkoberman AT gmail DOT com> MFC after: 2 months
|
241025 |
28-Sep-2012 |
kib |
Fix the mis-handling of the VV_TEXT on the nullfs vnodes.
If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write.
Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode.
Tested by: pho (previous version) MFC after: 2 weeks
|
241011 |
27-Sep-2012 |
mdf |
Fix up kernel sources to be ready for a 64-bit ino_t.
Original code by: Gleb Kurtsou
|
240720 |
20-Sep-2012 |
rmacklem |
Modify the NFSv4 client so that it can handle owner and owner_group strings that consist entirely of digits, interpreting them as the uid/gid number. This change was needed since new (>= 3.3) Linux servers reply with these strings by default. This change is mandated by the rfc3530bis draft. Reported on freebsd-stable@ under the Subject heading "Problem with Linux >= 3.3 as NFSv4 server" by Norbert Aschendorff on Aug. 20, 2012.
Tested by: norbert.aschendorff at yahoo.de Reviewed by: jhb MFC after: 2 weeks
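The rule described above (an owner or owner_group string made up entirely of digits is taken as the numeric id) can be sketched in userland as follows. This is an illustration only: `owner_to_id()` is a hypothetical name, not the function used by the NFSv4 client, and the 32-bit range check is an assumption about the id width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 and store the id if `owner` (length `len`, not necessarily
 * NUL-terminated) consists entirely of decimal digits; return 0 otherwise,
 * in which case the caller would fall back to name@domain mapping.
 */
int
owner_to_id(const char *owner, size_t len, uint32_t *idp)
{
	uint64_t val = 0;
	size_t i;

	assert(owner != NULL && idp != NULL);
	if (len == 0)
		return (0);
	for (i = 0; i < len; i++) {
		if (owner[i] < '0' || owner[i] > '9')
			return (0);	/* e.g. "user@domain" */
		val = val * 10 + (uint64_t)(owner[i] - '0');
		if (val > UINT32_MAX)
			return (0);	/* does not fit in a 32-bit id */
	}
	*idp = (uint32_t)val;
	return (1);
}
```

A string such as "1001" is accepted as uid or gid 1001, while anything containing a non-digit goes through the usual name lookup.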
|
240539 |
15-Sep-2012 |
ed |
Prefer __containerof() above member2struct().
The first does proper checking of the argument types, while the latter does not.
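The difference between a checked and an unchecked "container of" macro can be shown with a portable sketch. The macros below are hypothetical stand-ins (`DEMO_MEMBER2STRUCT`, `DEMO_CONTAINEROF`), not the definitions from the FreeBSD headers; the checked variant uses an unevaluated pointer comparison so that the compiler diagnoses a pointer of the wrong type.

```c
#include <assert.h>
#include <stddef.h>

struct link {
	struct link *next;
};

struct node {
	int id;
	struct link link;	/* embedded member */
};

/* Unchecked: any pointer is accepted, even one of the wrong type. */
#define	DEMO_MEMBER2STRUCT(ptr, type, member)				\
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Checked: the comparison inside sizeof is never evaluated, but it is still
 * type-checked, so a mismatched pointer type draws a compiler diagnostic.
 */
#define	DEMO_CONTAINEROF(ptr, type, member)				\
	((type *)((char *)(ptr) - offsetof(type, member) +		\
	    0 * sizeof((ptr) == &((type *)0)->member)))

/* Recover the enclosing node from a pointer to its embedded link. */
struct node *
node_from_link(struct link *l)
{
	assert(l != NULL);
	return (DEMO_CONTAINEROF(l, struct node, link));
}
```

Passing, say, an `int *` to the checked macro makes most compilers warn about comparing distinct pointer types, whereas the unchecked macro silently computes a bogus address.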
|
240464 |
13-Sep-2012 |
kib |
The deadfs VOPs for vop_ioctl and vop_bmap call themselves recursively, which is an elaborate way to cause kernel panic. Change the VOPs implementation to return EBADF for a reclaimed vnode.
While the calls to vop_bmap should not reach deadfs, it is indeed possible for vop_ioctl, because the VOP locking protocol is to pass the vnode to VOP unlocked. The actual panic was observed when ioctl was called on procfs filedescriptor which pointed to an exited process.
Reported by: zont Tested by: pho MFC after: 1 week
|
240379 |
12-Sep-2012 |
kevlo |
Add VFCF_READONLY flag that indicates ntfs and xfs file systems are only supported as read-only.
|
240358 |
11-Sep-2012 |
kevlo |
Prevent nump NULL pointer dereference in bmap_getlbns()
|
240355 |
11-Sep-2012 |
kevlo |
Fix style nit
|
240289 |
09-Sep-2012 |
rmacklem |
Add a simple printf() based debug facility to the new nfs client. Use it for a printf() that can be harmlessly generated for mmap()'d files. It will be used extensively for the NFSv4.1 client. Debugging printf()s are enabled by setting vfs.nfs.debuglevel to a non-zero value. The higher the value, the more debugging printf()s.
Reviewed by: jhb MFC after: 2 weeks
|
240285 |
09-Sep-2012 |
kib |
Allow shared lookups for nullfs mounts, if lower filesystem supports it. There are two problems which shall be addressed for shared lookups use to have measurable effect on nullfs scalability:
1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup operation to lower fs, resulting vnode is often only shared-locked. Then null_nodeget() cannot instantiate covering vnode for lower vnode, since insmntque1() and null_hashins() require exclusive lock on the lower.
Change the assert that lower vnode is exclusively locked to only require any lock. If null hash failed to find pre-existing nullfs vnode for lower vnode and the vnode is shared-locked, the lower vnode lock is upgraded.
2. Nullfs reclaims its vnodes on deactivation. This is due to the inability of nullfs to detect reclamation of the lower vnode. Reclamation of a nullfs vnode at deactivation time prevents a reference to the lower vnode from becoming stale.
Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode together with the reclamation of the lower vnode.
Note that the nullfs reclamation procedure calls vput() on the lowervp vnode, temporarily unlocking the vnode being reclaimed. This seems to be fine for MPSAFE filesystems, but non-MPSAFE code often puts a partially initialized vnode on some globally visible list, and later can decide that the half-constructed vnode is not needed. If a nullfs mount is created above such a filesystem, then other threads might catch such a not properly initialized vnode. Instead of trying to overcome this case, e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I decided to rely on the nearby removal of the support for non-MPSAFE filesystems.
In collaboration with: pho MFC after: 3 weeks
|
239636 |
24-Aug-2012 |
pfg |
Add some basic definitions for a future htree implementation.
MFC after: 3 days
|
239372 |
18-Aug-2012 |
kevlo |
Fix typo
|
239359 |
17-Aug-2012 |
mjg |
Remove unused member of struct indir (in_exists) from UFS and EXT2 code.
Reviewed by: mckusick Approved by: trasz (mentor) MFC after: 1 week
|
239303 |
15-Aug-2012 |
hselasky |
Streamline use of cdevpriv and correct some corner cases.
1) It is not useful to call "devfs_clear_cdevpriv()" from "d_close" callbacks, because for example read, write, ioctl and so on might be sleeping at the time "d_close" is called, and then the freed private data can still be accessed. Examples: dtrace, linux_compat, ksyms (all fixed by this patch)
2) In sys/dev/drm* there are some cases in which memory will be freed twice, if open fails, first by code in the open routine, secondly by the cdevpriv destructor. Move registration of the cdevpriv to the end of the drm open routines.
3) devfs_clear_cdevpriv() is not called if the "d_open" callback registered cdevpriv data and the "d_open" callback function returned an error. Fix this.
Discussed with: phk MFC after: 2 weeks
|
239246 |
14-Aug-2012 |
kib |
Do not leave invalid pages in the object after a short read for network file systems (not only NFS proper). Short reads cause pages other than the requested one, which were not filled by the read response, to stay invalid.
Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As result, not requested invalid pages are freed even if the read RPC indicated success.
Noted and reviewed by: alc MFC after: 1 week
|
239065 |
05-Aug-2012 |
kib |
After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages.
Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it.
Suggested and reviewed by: alc MFC after: 2 weeks
|
239040 |
04-Aug-2012 |
kib |
Reduce code duplication and exposure of direct access to struct vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other than the requested one, for VOP_GETPAGES().
Reviewed by: alc MFC after: 1 week
|
239039 |
04-Aug-2012 |
kib |
The header uma_int.h is internal uma header, unused by this source file. Do not include it needlessly.
Reviewed by: alc MFC after: 1 week
|
238936 |
31-Jul-2012 |
davidxu |
Comparing the current pipe code with the one in 8.3-STABLE r236165, I found that 8.3 is the historical BSD version that uses sockets to implement FIFO pipes; it compares a per-file seqcount with the writer generation stored in the per-pipe object. The concept is that after all writers are gone, the pipe enters the next generation, and all old readers that have not closed the pipe should get an indication that the pipe is disconnected: they should get EPIPE, SIGPIPE, or POLLHUP in poll(). But a newcomer should not know that previous writers were gone; it should treat the pipe as a fresh session. I am trying to bring FIFO pipes back to the historical behavior. It is still unclear whether a single EOF flag can represent both SBS_CANTSENDMORE and SBS_CANTRCVMORE, which the socket-based version uses, but I have run the poll regression test in the tools directory and the output is now the same as on 8.3-STABLE. I think the output "not ok 18 FIFO state 6b: poll result 0 expected 1. expected POLLHUP; got 0" might be bogus, because a newcomer should not know that old writers were gone. I got the same behavior on Linux. Our implementation always returns POLLIN for a disconnected pipe even when it should return POLLHUP, but I think it is not wise to remove POLLIN, for compatibility reasons: this is our historical behavior.
Regression test: /usr/src/tools/regression/poll
|
238928 |
31-Jul-2012 |
davidxu |
When a thread is blocked in the direct write state, it only sets the PIPE_DIRECTW flag but not PIPE_WANTW, and the FIFO pipe code does not understand this internal state. When a FIFO peer reader closes the pipe, it wants to notify the writer and checks PIPE_WANTW; if the flag is not set, it skips calling wakeup(), so the blocked writer never notices the close. In general, the writer should return from the syscall with the EPIPE error code and may get a SIGPIPE signal. Setting PIPE_WANTW fixes the problem; turning off direct write should fix it too. This bug was found via PR/170203.
Another bug in the FIFO pipe code is that when a peer closes the pipe, the other end, if blocked in select() or poll(), is not notified: the call to pipeselwakeup() was missing.
A third problem was found by the poll regression test: the existing code cannot pass the 6b, 6c and 6d tests, but FreeBSD-4 works. This commit does not fix the problem; I still need to study more to find the cause.
PR: 170203 Tested by: Garrett Copper < yanegomi at gmail dot com >
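The expected writer-side result mentioned in this entry (EPIPE, with SIGPIPE unless it is ignored) can be observed from userland with a short sketch. It is a simplification under assumptions: it uses an anonymous pipe instead of a named FIFO, the writer is not blocked when the reader goes away, and `write_after_reader_close()` is a hypothetical helper name.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Close the read end of a pipe, then write to it.  Returns the errno from
 * write(), which is expected to be EPIPE, or 0 if the write succeeded.
 */
int
write_after_reader_close(void)
{
	void (*old)(int);
	int error = 0, fds[2];

	if (pipe(fds) != 0)
		return (errno);
	/* Ignore SIGPIPE so the failure is reported through errno instead. */
	old = signal(SIGPIPE, SIG_IGN);
	close(fds[0]);
	if (write(fds[1], "x", 1) < 0)
		error = errno;
	close(fds[1]);
	signal(SIGPIPE, old);
	return (error);
}
```

The bug described above concerned a writer that was already asleep in the direct write path and never got woken up to see this error; the sketch only shows the error a correctly notified writer ends up with.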
|
238697 |
22-Jul-2012 |
kevlo |
Use NULL instead of 0 for pointers
|
238539 |
16-Jul-2012 |
brueffer |
Simplify error handling by moving the allocation of np down to where it is actually used. While here, improve style a little.
Submitted by: mjg MFC after: 2 weeks
|
238491 |
15-Jul-2012 |
brueffer |
Save a bzero() by using M_ZERO.
Obtained from: Dragonfly BSD (change 4faaf07c3d7ddd120deed007370aaf4d90b72ebb) MFC after: 2 weeks
|
238320 |
10-Jul-2012 |
attilio |
Remove a check on MNTK_UPDATE that is not really necessary as it is handled in a code snippet above.
|
238315 |
10-Jul-2012 |
attilio |
- Remove the unused and not completed write support for NTFS. - Fix a bug where vfs_mountedfrom() is called also when the filesystem is not mounted successfully.
Tested by: pho
|
238059 |
03-Jul-2012 |
kevlo |
Fix a typo
|
238029 |
02-Jul-2012 |
kib |
Extend the KPI to lock and unlock f_offset member of struct file. It now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries().
Ensure that on 32bit architectures f_offset, which is a 64bit quantity, is always read and written under the mtxpool protection. This fixes an apparently easy to trigger race where parallel lseek()s, or lseek() and read/write, could destroy the file offset.
The already broken ABI emulations, including iBCS and SysV, are not converted (yet).
Tested by: pho No objections from: jhb MFC after: 3 weeks
|
237987 |
02-Jul-2012 |
kib |
Do not override an error from uiomove() with (non-)error result from bwrite(). VFS needs to know about EFAULT from uiomove() and does not care much that partially filled block writeback after EFAULT was successful. Early return without error causes short write to be reported to usermode.
Reported and tested by: andreast MFC after: 3 weeks
|
237367 |
21-Jun-2012 |
kib |
Enable deadlock avoidance code for NFS client.
MFC after: 2 weeks
|
237244 |
18-Jun-2012 |
rmacklem |
Fix the NFSv4 client for the case where mmap'd files are written, but not msync'd by a process. A VOP_PUTPAGES() called when VOP_RECLAIM() happens will usually fail, since the NFSv4 Open has already been closed by VOP_INACTIVE(). Add a vm_object_page_clean() call to the NFSv4 client's VOP_INACTIVE(), so that the write happens before the NFSv4 Open is closed. kib@ suggested using vgone() instead and I will explore this, but this patch fixes things in the meantime. For some reason, the VOP_PUTPAGES() is still attempted in VOP_RECLAIM(), but having this fail doesn't cause any problems except a "stateid0 in write" being logged.
Reviewed by: kib MFC after: 1 week
|
237200 |
17-Jun-2012 |
rmacklem |
Move the nfsrpc_close() call in ncl_reclaim() for the NFSv4 client to below the vnode_destroy_vobject() call, since that is where writes are flushed.
Suggested by: kib MFC after: 1 week
|
236687 |
06-Jun-2012 |
kib |
Improve handling of uiomove(9) errors for the NFS client.
Do not brelse() the buffer unconditionally with BIO_ERROR set if uiomove() failed. The brelse() treats most buffers with BIO_ERROR as B_INVAL, dropping their content. Instead, if the write request covered the whole buffer, remember the cached state and brelse() with BIO_ERROR set only if the buffer was not cached previously.
Update the buffer dirtyoff/dirtyend based on the progress recorded by uiomove() in passed struct uio, even in the presence of error. Otherwise, usermode could see changed data in the backed pages, but later the buffer is destroyed without write-back.
If uiomove() failed for IO_UNIT request, try to truncate the vnode back to the pre-write state, and rewind the progress in passed uio accordingly, following the FFS behaviour.
Reviewed by: rmacklem (some time ago) Tested by: pho MFC after: 1 month
|
236313 |
30-May-2012 |
kib |
Capitalize start of sentence.
MFC after: 3 days
|
236188 |
28-May-2012 |
marcel |
Catch a corner case where ssegs could be 0 and thus i would be 0 and we index suinfo out of bounds (i.e. -1).
Approved by: gber
|
236140 |
27-May-2012 |
ed |
Fix style and consistency:
- Use tabs, not spaces. - Add tab after #define. - Don't mix the use of BSD and ISO C unsigned integer types. Prefer the ISO C ones.
|
235984 |
25-May-2012 |
gleb |
Use C99-style initialization for struct dirent in preparation for changing the structure.
Sponsored by: Google Summer of Code 2011
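A designated initializer names each field, so it keeps working if fields are later reordered, widened, or added. The sketch below illustrates the style on a made-up structure; `struct demo_dirent`, `DEMO_DT_DIR` and `make_dot_entry()` are hypothetical and only loosely modelled on struct dirent.

```c
#include <stdint.h>

#define	DEMO_DT_DIR	4	/* same value as DT_DIR on BSD and Linux */

/* Simplified stand-in for a directory entry record. */
struct demo_dirent {
	uint32_t d_fileno;
	uint16_t d_reclen;
	uint8_t  d_type;
	uint8_t  d_namlen;
	char     d_name[8];
};

/* Build the "." entry; fields not named here would be zero-initialized. */
struct demo_dirent
make_dot_entry(uint32_t ino)
{
	struct demo_dirent de = {
		.d_fileno = ino,
		.d_reclen = sizeof(struct demo_dirent),
		.d_type = DEMO_DT_DIR,
		.d_namlen = 1,
		.d_name = ".",
	};

	return (de);
}
```

A positional initializer such as `{ ino, sizeof(...), 4, 1, "." }` would silently assign the wrong fields once the layout changes, which is the hazard the named form avoids.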
|
235922 |
24-May-2012 |
mav |
Revert devfs part of r235911. I was unaware about old but unfinished discussion between kib@ and gibbs@ about it.
|
235911 |
24-May-2012 |
mav |
MFprojects/zfsd: Revamp the CAM enclosure services driver. This updated driver uses an in-kernel daemon to track state changes and publishes physical path location information for disk elements into the CAM device database.
Sponsored by: Spectra Logic Corporation Sponsored by: iXsystems, Inc. Submitted by: gibbs, will, mav
|
235568 |
17-May-2012 |
rmacklem |
A problem with the NFSv4 server was reported by Andrew Leonard to freebsd-fs@, where the setfacl of an NFSv4 acl would fail. This was caused by the VOP_ACLCHECK() call for ZFS replying EOPNOTSUPP. After discussion with rwatson@, it was determined that a call to VOP_ACLCHECK() before doing VOP_SETACL() is not required. This patch fixes the problem by deleting the VOP_ACLCHECK() call.
Tested by: Andrew Leonard (previous version) MFC after: 1 week
|
235537 |
17-May-2012 |
gber |
Import work done under project/nand (@235533) into head.
The NAND Flash environment consists of several distinct components: - NAND framework (drivers harness for NAND controllers and NAND chips) - NAND simulator (NANDsim) - NAND file system (NAND FS) - Companion tools and utilities - Documentation (manual pages)
This work is still experimental. Please use with caution.
Obtained from: Semihalf Supported by: FreeBSD Foundation, Juniper Networks
|
235508 |
16-May-2012 |
pfg |
Fix a couple of issues that appear to be inherited from the old 8.x code: - If the lock cannot be acquired immediately unlocks 'bar' vnode and then locks both vnodes in order. - wrong vnode type panics from cache_enter_time after calls by ext2_lookup.
The fix merges the fixes from ufs/ufs_lookup.c.
Submitted by: Mateusz Guzik Approved by: jhb@ (mentor) Reviewed by: kib@ MFC after: 1 week
|
235503 |
16-May-2012 |
gleb |
Skip directory entries with zero inode number during traversal.
Entries with zero inode number are considered placeholders by libc and UFS. Fix remaining uses of VOP_READDIR in kernel: vop_stdvptocnp, unionfs.
Sponsored by: Google Summer of Code 2011
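The traversal rule in this entry can be sketched as a loop that ignores placeholder slots. The structure and function names (`struct demo_entry`, `count_live_entries()`) are hypothetical; real consumers walk the buffer returned by VOP_READDIR rather than an array like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified directory entry: an inode number of 0 marks a placeholder. */
struct demo_entry {
	uint64_t ino;
	const char *name;
};

/* Count the entries that refer to real files, skipping placeholders. */
size_t
count_live_entries(const struct demo_entry *ents, size_t n)
{
	size_t i, live = 0;

	assert(ents != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (ents[i].ino == 0)
			continue;	/* placeholder, not a real file */
		live++;
	}
	return (live);
}
```

A consumer that forgets the check would treat the placeholder as a real name, which is the class of bug fixed in vop_stdvptocnp and unionfs.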
|
235381 |
12-May-2012 |
rmacklem |
Fix two cases in the new NFS server where a tsleep() is used, when the code should actually protect the tested variable with a mutex. Since the tsleep()s had a 10sec timeout, the race would have only delayed the allocation of a new clientid for a client. The sleeps will also rarely occur in practice, since having a callback in progress when a fresh clientid is being acquired by a client is unlikely.
MFC after: 1 month
|
235332 |
12-May-2012 |
rmacklem |
PR# 165923 reported intermittent write failures for dirty memory mapped pages being written back on an NFS mount. Since any thread can call VOP_PUTPAGES() to write back a dirty page, the credentials of that thread may not have write access to the file on an NFS server. (Often the uid is 0, which may be mapped to "nobody" in the NFS server.) Although there is no completely correct fix for this (NFS servers check access on every write RPC instead of at open/mmap time), this patch avoids the common cases by holding onto a credential that recently opened the file for writing and uses that credential for the write RPCs being done by VOP_PUTPAGES() for both NFS clients.
Tested by: Joel Ray Holveck (joelh at juniper.net) PR: kern/165923 Reviewed by: kib MFC after: 2 weeks
|
235241 |
10-May-2012 |
pluknet |
Fix mount interlock oversights from the previous change in r234386.
Reported by: dougb Submitted by: Mateusz Guzik <mjguzik at gmail com> Reviewed by: Kirk McKusick Tested by: pho
|
235136 |
08-May-2012 |
jwd |
Use the common api helper routine instead of freeing the namei buffer directly.
Approved by: rmacklem (mentor) MFC after: 1 month
|
234944 |
03-May-2012 |
daichi |
fixed a unionfs_readdir math issue
PR: 132987 Submitted by: Matthew Fleming <mfleming@isilon.com>
|
234867 |
01-May-2012 |
daichi |
- fixed a vnode lock hang-up issue.
- fixed an incorrect lock status issue.
- fixed an incorrect lock issue of unionfs root vnode removed. (pointed out by keith)
- fixed an infinite loop issue. (pointed out by dumbbell)
- changed to do LK_RELEASE expressly when unlocked.
Submitted by: ozawa@ongs.co.jp
|
234742 |
27-Apr-2012 |
rmacklem |
It was reported via email that some non-FreeBSD NFS servers do not include file attributes in the reply to an NFS create RPC under certain circumstances. This resulted in a vnode of type VNON that was not usable. This patch adds an NFS getattr RPC to nfs_create() for this case, to fix the problem. It was tested by the person that reported the problem and confirmed to fix this case for their server.
Tested by: Steven Haber (steven.haber at isilon.com) MFC after: 2 weeks
|
234740 |
27-Apr-2012 |
rmacklem |
Fix a leak of namei lookup path buffers that occurs when a ZFS volume is exported via the new NFS server. The leak occurred because the new NFS server code didn't handle the case where a file system sets the SAVENAME flag in its VOP_LOOKUP() and ZFS does this for the DELETE case.
Tested by: Oliver Brandmueller (ob at gruft.de), hrs PR: kern/167266 MFC after: 1 month
|
234607 |
23-Apr-2012 |
trasz |
Remove unused thread argument to vrecycle().
Reviewed by: kib
|
234605 |
23-Apr-2012 |
trasz |
Remove unused thread argument from vtruncbuf().
Reviewed by: kib
|
234482 |
20-Apr-2012 |
mckusick |
This change creates a new list of active vnodes associated with a mount point. Active vnodes are those with a non-zero use or hold count, e.g., those vnodes that are not on the free list. Note that this list is in addition to the list of all the vnodes associated with a mount point.
To avoid adding another set of linkage pointers to the vnode structure, the active list uses the existing linkage pointers used by the free list (previously named v_freelist, now renamed v_actfreelist).
This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point).
Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks
|
234422 |
18-Apr-2012 |
jh |
Return EOPNOTSUPP rather than EPERM for the SF_SNAPSHOT flag because tmpfs doesn't support snapshots.
Suggested by: bde
|
234386 |
17-Apr-2012 |
mckusick |
Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL. The primary changes are that the user of the interface no longer needs to manage the mount-mutex locking and that the vnode that is returned has its mutex locked (thus avoiding the need to check whether it is DOOMED or in another possible end-of-life scenario).
To minimize compatibility issues for third-party developers, the old MNT_VNODE_FOREACH interface will remain available so that this change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH will be removed in head.
The reason for this update is to prepare for the addition of the MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point).
Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks
|
234347 |
16-Apr-2012 |
jh |
Sync tmpfs_chflags() with the recent changes to UFS:
- Add a check for unsupported file flags.
- Return EPERM when a user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.
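A minimal userland sketch of the two checks listed above. The flag values, the SK_ names and sketch_chflags_check() are hypothetical and chosen for illustration only; the order of the checks and the use of EOPNOTSUPP for unsupported flags are assumptions, not taken from tmpfs_chflags().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits; the real values live in <sys/stat.h>. */
#define	SK_UF_NODUMP	0x00000001UL	/* user-settable flag */
#define	SK_SF_IMMUTABLE	0x00020000UL	/* system-settable flag */
#define	SK_SF_SNAPSHOT	0x00200000UL	/* not supported by this fs */
#define	SK_SUPPORTED	(SK_UF_NODUMP | SK_SF_IMMUTABLE)
#define	SK_SF_SETTABLE	0xffff0000UL	/* mask of system flags */

/*
 * Validate a flags change from 'cur' to 'req'.  'privileged' stands in
 * for a successful PRIV_VFS_SYSFLAGS privilege check.
 */
int
sketch_chflags_check(unsigned long cur, unsigned long req, int privileged)
{
	/* Reject any flag the file system does not implement. */
	if ((req & ~SK_SUPPORTED) != 0)
		return (EOPNOTSUPP);
	/* Only a privileged caller may toggle system flags. */
	if (!privileged && ((cur ^ req) & SK_SF_SETTABLE) != 0)
		return (EPERM);
	return (0);
}
```

Comparing cur ^ req against the system mask means an unprivileged caller can still change user flags on a file that already carries a system flag, as long as the system bits stay untouched.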
|
234346 |
16-Apr-2012 |
jh |
tmpfs: Allow update mounts only for certain options.
Since r230208 update mounts were allowed if the list of mount options contained the "export" option. This is not correct as tmpfs doesn't really support updating all options.
Reviewed by: kevlo, trociny
|
234325 |
15-Apr-2012 |
gleb |
Provide better description for vfs.tmpfs.memory_reserved sysctl.
Suggested by: Anton Yuzhaninov <citrin@citrin.ru>
|
234203 |
13-Apr-2012 |
jh |
Apply changes from r234103 to ext2fs:
Return EPERM from ext2_setattr() when a user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.
Flags are now stored to ip->i_flags in one place after all checks.
Also, remove SF_NOUNLINK from the checks because ext2fs doesn't support that flag.
Reviewed by: bde
|
234139 |
11-Apr-2012 |
jh |
Restore the blank line incorrectly removed in r234104.
Pointed out by: bde
|
234104 |
10-Apr-2012 |
jh |
Apply changes from r233787 to ext2fs:
- Use more natural ip->i_flags instead of vap->va_flags in the final flags check.
- Style improvements.
No functional change intended.
MFC after: 2 weeks
|
234064 |
09-Apr-2012 |
attilio |
- Introduce a cache-miss optimization for consistency with other accesses of the cache member of vm_object objects.
- Use novel vm_page_is_cached() for checks outside of the vm subsystem.
Reviewed by: alc MFC after: 2 weeks X-MFC: r234039
|
234025 |
08-Apr-2012 |
mckusick |
Add I/O accounting to msdos filesystem.
Suggested and reviewed by: kib
|
234000 |
07-Apr-2012 |
gleb |
tmpfs supports only INT_MAX nodes due to limitations of unit number allocator.
Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in memory is not likely to become possible in the foreseeable future and would require a new unit number allocator.
Discussed with: delphij MFC after: 2 weeks
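As a rough illustration of the limit change above, a node-count clamp might look like the following. sketch_clamp_nodes() is a hypothetical helper, and treating 0 as "no limit requested" is an assumption of this sketch rather than something stated in the commit.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Clamp a requested node limit to INT_MAX, the most a unit number
 * allocator handing out int values can serve.  A request of 0 is
 * treated as "use the maximum" (assumption of this sketch).
 */
uint64_t
sketch_clamp_nodes(uint64_t requested)
{
	if (requested == 0 || requested > (uint64_t)INT_MAX)
		return ((uint64_t)INT_MAX);
	return (requested);
}
```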
|
233999 |
07-Apr-2012 |
gleb |
Add vfs_getopt_size. Support human readable file system options in tmpfs.
Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.
Discussed with: delphij MFC after: 2 weeks
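To show what "human readable" option values mean in practice, here is a small userland parser. It is not vfs_getopt_size(); sketch_parse_size() is a hypothetical function, and the accepted suffix set (k, m, g, t, binary multiples) is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "512", "64k", "16m", "4g" or "1t" into a byte count.
 * Returns 0 on success, EINVAL on malformed input, ERANGE on overflow.
 */
int
sketch_parse_size(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;
	unsigned int shift = 0;

	/* Require a leading digit: no sign, no leading whitespace. */
	if (s == NULL || *s < '0' || *s > '9')
		return (EINVAL);
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno == ERANGE)
		return (ERANGE);
	switch (*end) {
	case '\0':
		break;
	case 'k': case 'K':
		shift = 10;
		break;
	case 'm': case 'M':
		shift = 20;
		break;
	case 'g': case 'G':
		shift = 30;
		break;
	case 't': case 'T':
		shift = 40;
		break;
	default:
		return (EINVAL);
	}
	/* Nothing may follow the single suffix character. */
	if (*end != '\0' && end[1] != '\0')
		return (EINVAL);
	if (shift != 0 && v > (UINT64_MAX >> shift))
		return (ERANGE);
	*out = (uint64_t)v << shift;
	return (0);
}
```

With such a parser, a mount option like size=64m resolves to 67108864 bytes instead of requiring the caller to spell out the number.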
|
233998 |
07-Apr-2012 |
gleb |
Add reserved memory limit sysctl to tmpfs.
Clean up available and used memory functions. Check if free pages are available before allocating a new node.
Discussed with: delphij
|
233101 |
17-Mar-2012 |
kib |
Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client behaviour on error from write RPC back to the behaviour of the old nfs client. When set to a non-zero value, the pages for which the write failed are kept dirty.
PR: kern/165927 Reviewed by: alc MFC after: 2 weeks
|
232960 |
14-Mar-2012 |
gleb |
Prevent tmpfs_rename() deadlock in a way similar to UFS
Unlock vnodes and try to lock them one by one. Relookup fvp and tvp.
Approved by: mdf (mentor)
|
232959 |
14-Mar-2012 |
gleb |
Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp()
Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same.
Don't mess with vnode holdcount, vget() takes care of it already.
Approved by: mdf (mentor)
|
232918 |
13-Mar-2012 |
kevlo |
Use NULL instead of 0
|
232823 |
11-Mar-2012 |
kib |
Update comment.
Submitted by: gianni
|
232821 |
11-Mar-2012 |
kib |
Remove fifo.h. The only used function declaration from the header is migrated to sys/vnode.h.
Submitted by: gianni
|
232703 |
08-Mar-2012 |
pfg |
Add support for ns timestamps and birthtime to the ext2/3 driver.
When using big inodes there is sufficient space in ext3 to keep extra resolution and birthtime (creation) timestamps. The appropriate fields in the on-disk inode have been approved for a long time but support for this in ext3 has not been widely distributed.
In preparation for ext4 most linux distributions have enabled by default such bigger inodes and some people use nanosecond timestamps in ext3. We now support those when the inode is big enough and while we do recognize the EXT4F_ROCOMPAT_EXTRA_ISIZE, we maintain the extra timestamps even when they are not used.
An additional note by Bruce Evans: We blindly accept unrepresentable tv_nsec in VOP_SETATTR(), but all file systems have always done that. When POSIX gets around to specifying the behaviour, it will probably require certain rounding to the fs's resolution and not rejecting the request. This unfortunately means that syscalls that set times can't really tell if they succeeded without reading back the times using stat() or similar and checking that they were set close enough.
Reviewed by: bde Approved by: jhb (mentor) MFC after: 2 weeks
|
232701 |
08-Mar-2012 |
jhb |
Add KTR_VFS traces to track modifications to a vnode's writecount.
|
232641 |
07-Mar-2012 |
kib |
The pipe_poll() performs lockless access to the vnode to test fifo_iseof() condition, allowing the v_fifoinfo to be reset and freed by fifo_cleanup().
Precalculate EOF at the places where fo_wgen is changed, and cache the state in a new pipe state flag PIPE_SAMEWGEN.
Reported and tested by: bf Submitted by: gianni MFC after: 1 week (a backport)
|
232541 |
05-Mar-2012 |
kib |
Apply inlined vn_vget_ino() algorithm for ".." lookup in pseudofs.
Reported and tested by: pho MFC after: 2 weeks
|
232493 |
04-Mar-2012 |
kib |
Remove unneeded cast to u_int. The values are small enough to fit into int, besides the use of the MIN macro, which performs type promotions.
Submitted by: bde MFC after: 3 weeks
|
232485 |
04-Mar-2012 |
kevlo |
Remove unnecessary casts
|
232483 |
04-Mar-2012 |
kevlo |
Clean up style(9) nits
|
232467 |
03-Mar-2012 |
rmacklem |
The name caching changes of r230394 exposed an intermittent bug in the new NFS server for NFSv4, where it would report ENOENT when the file actually existed on the server. This turned out to be caused by not initializing ni_topdir before calling lookup() and there was a rare case where the value on the stack location assigned to ni_topdir happened to be a pointer to a ".." entry, such that "dp == ndp->ni_topdir" succeeded in lookup(). This patch initializes ni_topdir to fix the problem.
MFC after: 5 days
|
232420 |
03-Mar-2012 |
rmacklem |
Post r230394, the Lookup RPC counts for both NFS clients increased significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels.
Reported by: bde Discussed with: kib Reviewed by: jhb MFC after: 2 weeks
|
232401 |
02-Mar-2012 |
jhb |
Similar to the fixes in 226967 and 226987, purge any name cache entries associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname.
MFC after: 2 weeks
|
232383 |
02-Mar-2012 |
kib |
Do not expose unlocked unconstructed nullfs vnode on mount list. Lock the native nullfs vnode lock before switching the locks.
Tested by: pho MFC after: 1 week
|
232327 |
01-Mar-2012 |
rmacklem |
Fix the NFS clients so that they use copyin() instead of bcopy(), when doing direct I/O. This direct I/O code is not enabled by default.
Submitted by: kib (earlier version) Reviewed by: kib MFC after: 1 week
|
232307 |
29-Feb-2012 |
mm |
Add "export" to devfs_opts[] and return EOPNOTSUPP if called with it. Fixes mountd warnings.
Reported by: kib MFC after: 1 week
|
232305 |
29-Feb-2012 |
kib |
Allow shared locks for reads when lower filesystem accept shared locking.
Tested by: pho MFC after: 1 week
|
232304 |
29-Feb-2012 |
kib |
Document that null_nodeget() cannot take shared-locked lowervp due to insmntque() requirements.
Tested by: pho MFC after: 1 week
|
232303 |
29-Feb-2012 |
kib |
In null_reclaim(), assert that reclaimed vnode is fully constructed, instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, panicked later when getting rid of nullfs reference on lowervp.
While there, remove initializations from the declaration block.
Tested by: pho MFC after: 1 week
|
232301 |
29-Feb-2012 |
kib |
Always request exclusive lock for the lower vnode in nullfs_vget(). The null_nodeget() requires exclusive lock on lowervp to be able to insmntque() new vnode.
Reported by: rea Tested by: pho MFC after: 1 week
|
232299 |
29-Feb-2012 |
kib |
Move the code to destroy half-constructed nullfs vnode into helper function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed.
Lock the vnode interlock around reassigning the v_vnlock.
In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements.
Reported by: rea Tested by: pho MFC after: 1 week
|
232296 |
29-Feb-2012 |
kib |
Merge a split multi-line comment.
MFC after: 1 week
|
232278 |
29-Feb-2012 |
mm |
Add procfs to jail-mountable filesystems.
Reviewed by: jamie MFC after: 1 week
|
232100 |
24-Feb-2012 |
kevlo |
Remove an unused structure and unnecessary cast
|
232099 |
24-Feb-2012 |
kevlo |
Check if the user has necessary permissions on the device
|
232059 |
23-Feb-2012 |
mm |
To improve control over the use of mount(8) inside a jail(8), introduce a new jail parameter node with the following parameters:
allow.mount.devfs: allow mounting the devfs filesystem inside a jail
allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail
Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting.
Reviewed by: jamie Suggested by: pjd MFC after: 2 weeks
|
232055 |
23-Feb-2012 |
kmacy |
merge pipe and fifo implementations
Also reviewed by: jhb, jilles (initial revision) Tested by: pho, jilles
Submitted by: gianni Reviewed by: bde
|
232050 |
23-Feb-2012 |
rmacklem |
hrs@ reported a panic to freebsd-stable@ under the subject line "panic in 8.3-PRERELEASE" on Feb. 22, 2012. This panic was caused by use of a mix of tsleep() and msleep() calls on the same event in the new NFS server DRC code. It did "mtx_unlock(); tsleep();" in two places, which kib@ noted introduced a slight risk that the wakeup() would occur before the tsleep(), resulting in a 10sec delay before waking up. This patch fixes the problem by replacing "mtx_unlock(); tsleep();" with mtx_sleep(..PDROP..). It also changes a nfsmsleep() call to mtx_sleep() so that the code uses mtx_sleep() consistently within the file.
Tested by: hrs (in progress) Reviewed by: jhb MFC after: 5 days
|
231998 |
22-Feb-2012 |
kib |
Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests.
Noted by: bde MFC after: 1 week
|
231949 |
21-Feb-2012 |
kib |
Fix found places where uio_resid is truncated to int.
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows SSIZE_MAX-sized i/o requests to be performed from usermode.
Discussed with: bde, das (previous versions) MFC after: 1 month
|
231932 |
20-Feb-2012 |
kevlo |
Remove an unnecessary cast.
|
231852 |
17-Feb-2012 |
bz |
Merge multi-FIB IPv6 support from projects/multi-fibv6/head/:
Extend the so far IPv4-only support for multiple routing tables (FIBs) introduced in r178888 to IPv6 providing feature parity.
This includes an extended rtalloc(9) KPI for IPv6, the necessary adjustments to the network stack, and user land support as in netstat.
Sponsored by: Cisco Systems, Inc. Reviewed by: melifaro (basically) MFC after: 10 days
|
231805 |
16-Feb-2012 |
rmacklem |
Delete a couple of out of date comments that are no longer true in the new NFS client.
Requested by: bde MFC after: 1 week
|
231669 |
14-Feb-2012 |
tijl |
Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to intmax_t instead of uintmax_t, because the original type is off_t.
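The pattern described above, sketched in userland C. sketch_format_offset() is a hypothetical wrapper; the point is only that off_t is a signed type, so it is cast to intmax_t and printed with the "j" length modifier and the "d" conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Format a file offset portably.  off_t is signed, so the matching
 * widest type is intmax_t (not uintmax_t) and the conversion is "%jd".
 */
int
sketch_format_offset(char *buf, size_t len, off_t off)
{
	return (snprintf(buf, len, "%jd", (intmax_t)off));
}
```

Casting to uintmax_t instead would print a negative offset as a very large positive number.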
|
231379 |
10-Feb-2012 |
ed |
Merge si_name and __si_namebuf.
The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.
|
231269 |
09-Feb-2012 |
mm |
Allow mounting nullfs(5) inside jails.
This is now possible thanks to r230129.
MFC after: 1 month
|
231267 |
09-Feb-2012 |
mm |
Add support for mounting devfs inside jails.
A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules.
Utilizes new functions introduced in r231265.
Reviewed by: jamie MFC after: 1 month
|
231265 |
09-Feb-2012 |
mm |
Introduce the "ruleset=number" option for devfs(5) mounts. Add support for updating the devfs mount (currently only changing the ruleset number is supported). Check mnt_optnew with vfs_filteropt(9).
This new option sets the specified ruleset number as the active ruleset of the new devfs mount and applies all its rules at mount time. If the specified ruleset doesn't exist, a new empty ruleset is created.
MFC after: 1 month
|
231168 |
07-Feb-2012 |
pfg |
Update the data structures with some fields reserved for ext4 but that can be used in ext3 mode.
Also adjust the internal inode to carry the birthtime, like in UFS, which is starting to get some use when big inodes are available.
Right now these are just placeholders for features to come.
Approved by: jhb (mentor) MFC after: 2 weeks
|
231133 |
07-Feb-2012 |
rmacklem |
r228827 fixed a problem where copying of NFSv4 open credentials into a credential structure would corrupt it. This happened when the p argument was != NULL. However, I now realize that the copying of open credentials should only happen for p == NULL, since that indicates that it is a read-ahead or write-behind. This patch fixes this. After this commit, r228827 could be reverted, but I think the code is clearer and safer with the patch, so I am going to leave it in. Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have changed the credentials of the caller. This would have happened if the process doing the VOP_SETATTR() did not have the file open, but some other process running as a different uid had the file open for writing at the same time.
MFC after: 5 days
|
231088 |
06-Feb-2012 |
jhb |
Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().
|
231075 |
06-Feb-2012 |
kib |
Current implementations of sync(2) and syncer vnode fsync() VOP use the mnt_noasync counter to temporarily remove the MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of the MNTK_ASYNC option is harmful because not only i/o started from the corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity.
Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC.
Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons.
Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks
|
230803 |
31-Jan-2012 |
rmacklem |
When a "mount -u" switches an NFS mount point from TCP to UDP, any thread doing an I/O RPC with a transfer size greater than NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC. After a discussion on freebsd-fs@, I decided to add a warning message for this case, as suggested by Jeremy Chadwick.
Suggested by: freebsd at jdc.parodius.com (Jeremy Chadwick) MFC after: 2 weeks
|
230605 |
27-Jan-2012 |
rmacklem |
A problem with respect to data read through the buffer cache for both NFS clients was reported to freebsd-fs@ under the subject "NFS corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when a TCP mounted root fs was changed to using UDP. I believe that this problem was caused by the change in mnt_stat.f_iosize that occurred because rsize was decreased to the maximum supported by UDP. This patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize, since the former is set to f_iosize when the vnode is allocated, but does not change for a given vnode when f_iosize changes.
Reported by: pjd Reviewed by: kib MFC after: 2 weeks
|
230559 |
26-Jan-2012 |
rmacklem |
Revert r230516, since it doesn't really fix the problem.
|
230552 |
25-Jan-2012 |
kib |
Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for.
Reviewed by: jhb (previous version) MFC after: 2 weeks
|
230547 |
25-Jan-2012 |
jhb |
Add a timeout on positive name cache entries in the NFS client. That is, we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely.
Reviewed by: rmacklem MFC after: 1 week
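The validation rule described above can be sketched as a small predicate. The struct and function names are hypothetical and the time handling is simplified to whole seconds; this is not the NFS client code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-entry state for a positive name cache entry. */
struct sketch_ncentry {
	int64_t	ctime;	/* ctime recorded when the entry was added */
	int64_t	added;	/* time (seconds) at which the entry was added */
};

/*
 * Decide whether a positive hit may be trusted without a LOOKUP RPC.
 * A matching ctime alone is not enough: the entry must also be younger
 * than nametimeo seconds.  nametimeo == 0 therefore disables positive
 * name caching entirely.
 */
bool
sketch_positive_hit_ok(const struct sketch_ncentry *e, int64_t cur_ctime,
    int64_t now, int64_t nametimeo)
{
	if (e->ctime != cur_ctime)
		return (false);
	return (now - e->added < nametimeo);
}
```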
|
230516 |
25-Jan-2012 |
rmacklem |
If a mount -u is done to either NFS client that switches it from TCP to UDP and the rsize/wsize/readdirsize is greater than NFS_MAXDGRAMDATA, it is possible for a thread doing an I/O RPC to get stuck repeatedly doing retries. This happens because the RPC will use an rsize/wsize/readdirsize that won't work for UDP and, as such, it will keep failing indefinitely. This patch returns an error for this case, to avoid the problem. A discussion on freebsd-fs@ seemed to indicate that returning an error was preferable to silently ignoring the "udp"/"mntudp" option. This problem was discovered while investigating a problem reported by pjd@ via email.
MFC after: 2 weeks
|
230394 |
20-Jan-2012 |
jhb |
Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently.
Some more details:
- Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes.
- Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the directory to validate the namecache hit for "." anyway.
- ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC.
MFC after: 2 weeks
|
230345 |
20-Jan-2012 |
rmacklem |
Martin Cracauer reported a problem to freebsd-current@ under the subject "Data corruption over NFS in -current". During investigation of this, I came across an ugly bogusity in the new NFS client where it replaced the cr_uid with the one used for the mount. This was done so that "system operations" like the NFSv4 Renew would be performed as the user that did the mount. However, if any other thread shares the credential with the one doing this operation, it could do an RPC (or just about anything else) as the wrong cr_uid. This patch fixes the above, by using the mount credentials instead of the one provided as an argument for this case. It appears to have fixed Martin's problem. This patch is needed for NFSv4 mounts and NFSv3 mounts against some non-FreeBSD servers that do not put post operation attributes in the NFSv3 Statfs RPC reply.
Tested by: Martin Cracauer (cracauer at cons.org) Reviewed by: jhb MFC after: 2 weeks
|
230304 |
18-Jan-2012 |
rea |
Subject: NULLFS: properly destroy node hash
Use hashdestroy() instead of naive free().
Approved by: kib MFC after: 2 weeks
|
230252 |
17-Jan-2012 |
kevlo |
Return EOPNOTSUPP since we only support update mounts for NFS export.
Spotted by: trociny
|
230249 |
17-Jan-2012 |
mckusick |
Make sure all intermediate variables holding mount flags (mnt_flag) and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost.
MFC after: 2 weeks
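A small demonstration of the failure mode this change guards against. The flag values and function names are hypothetical; the sketch only shows that a 32-bit intermediate silently drops a flag that lives in the top 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define	SK_MNT_LOW	UINT64_C(0x0000000000000001)
#define	SK_MNT_HIGH	UINT64_C(0x0000000100000000)	/* flag in the top 32 bits */

/* Buggy pattern: the intermediate variable is only 32 bits wide. */
uint64_t
sketch_pass_narrow(uint64_t flags)
{
	uint32_t tmp = (uint32_t)flags;	/* top 32 bits are dropped here */

	return (tmp);
}

/* Fixed pattern: every intermediate is uint64_t. */
uint64_t
sketch_pass_wide(uint64_t flags)
{
	uint64_t tmp = flags;

	return (tmp);
}
```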
|
230208 |
16-Jan-2012 |
kevlo |
Add nfs export support to tmpfs(5)
Reviewed by: kib
|
230180 |
16-Jan-2012 |
alc |
When tmpfs_write() resets an extended file to its original size after an error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size.
Reviewed by: kib MFC after: 3 weeks
|
230145 |
15-Jan-2012 |
trociny |
Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size.
As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp.
Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted process stack.
Suggested by: kib MFC after: 3 days
|
230132 |
15-Jan-2012 |
uqs |
Convert files to UTF-8
|
230120 |
14-Jan-2012 |
alc |
Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call vm_object_pip_{add,subtract}() on the swap object because the swap object can't be destroyed while the vnode is exclusively locked. Moreover, even if the swap object could have been destroyed during tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken because vm_object_pip_subtract() does not wake up the sleeping thread that is trying to destroy the swap object.
Free invalid pages after an I/O error. There is no virtue in keeping them around in the swap object creating more work for the page daemon. (I believe that any non-busy page in the swap object will now always be valid.)
vm_pager_get_pages() does not return a standard errno, so its return value should not be returned by tmpfs without translation to an errno value.
There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to occur with the swap object locked.
Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite(). (The swap pager already spams your console if data corruption is imminent.)
Reviewed by: kib MFC after: 3 weeks
|
230100 |
14-Jan-2012 |
rmacklem |
Tai Horgan reported via email that there were two places in the new NFSv4 server where the code follows the wrong list. Fortunately, for these fairly rare cases, the lc_stateid[] lists are normally empty. This patch fixes the code to follow the correct list.
Reported by: tai.horgan at isilon.com Discussed with: zack MFC after: 2 weeks
|
229956 |
11-Jan-2012 |
rmacklem |
jwd@ reported via email that the "CacheSize" field reported by "nfsstat -e -s" would go negative after using the "-z" option to zero out the stats. This patch fixes that by not zeroing out the srvcache_size field for "-z", since it is the size of the cache and not a counter.
MFC after: 2 weeks
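The distinction between counters and a size field can be sketched as follows. The struct layout and names are hypothetical, not the real NFS server statistics structure; presumably the reported value went negative because later removals decremented a size that had been reset to zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical statistics block: two counters and one gauge. */
struct sketch_srvstats {
	uint64_t	hits;		/* counter: safe to zero */
	uint64_t	misses;		/* counter: safe to zero */
	int64_t		cache_size;	/* gauge: current number of entries */
};

/* Zero the counters but keep the gauge, which tracks live state. */
void
sketch_zero_stats(struct sketch_srvstats *st)
{
	int64_t size = st->cache_size;

	memset(st, 0, sizeof(*st));
	st->cache_size = size;
}
```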
|
229821 |
08-Jan-2012 |
alc |
Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application.
Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue.
Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks
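The partial page case can be illustrated in userland: when the new size is not a multiple of the page size, the tail of the last page must be zeroed so that stale bytes cannot reappear if the file grows again. The page size constant and sketch_zero_tail() are assumptions of this sketch, which ignores swap-backed pages entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	SK_PAGE_SIZE	4096u	/* assumed page size for this sketch */

/*
 * Zero the part of the last page that lies beyond new_size.
 * Returns the number of bytes zeroed (0 if new_size is page aligned).
 */
size_t
sketch_zero_tail(unsigned char *last_page, size_t new_size)
{
	size_t off = new_size % SK_PAGE_SIZE;

	if (off == 0)
		return (0);
	memset(last_page + off, 0, SK_PAGE_SIZE - off);
	return (SK_PAGE_SIZE - off);
}
```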
|
229802 |
08-Jan-2012 |
rmacklem |
opt_inet6.h was missing from some files in the new NFS subsystem. The effect of this was, for clients mounted via inet6 addresses, that the DRC cache would never have a hit in the server. It also broke NFSv4 callbacks when an inet6 address was the only one available in the client. This patch fixes the above, plus deletes opt_inet6.h from a couple of files it is not needed for.
MFC after: 2 weeks
|
229694 |
06-Jan-2012 |
jh |
r222004 changed sbuf_finish() to not clear the buffer error status. As a consequence sbuf_len() will return -1 for buffers which had the error status set prior to sbuf_finish() call. This causes a problem in pfs_read() which purposely uses a fixed size sbuf to discard bytes which are not needed to fulfill the read request.
Work around the problem by using the full buffer length when sbuf_finish() indicates an overflow. An overflowed sbuf with fixed size is always full.
PR: kern/163076 Approved by: des MFC after: 2 weeks
|
229692 |
06-Jan-2012 |
jh |
Check the return value of sbuf_finish() in pfs_readlink() and return ENAMETOOLONG if the buffer overflowed.
Approved by: des MFC after: 2 weeks
|
229600 |
05-Jan-2012 |
dim |
In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode pointer 'lowervp' instead of 'vp', which is uninitialized at that point.
Reviewed by: kib MFC after: 1 week
|
229431 |
03-Jan-2012 |
kib |
Do the vput() for the lowervp in the null_nodeget() for error case too. Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself.
Reported and tested by: pho MFC after: 1 week
|
229428 |
03-Jan-2012 |
kib |
Document the state of the lowervp vnode for null_nodeget().
Tested by: pho MFC after: 1 week
|
229407 |
03-Jan-2012 |
pfg |
Minor cleanups to ntfs code
bzero -> memset; rename variables to avoid shadowing.
PR: 142401 Obtained from: NetBSD Approved by: jhb (mentor)
|
229363 |
03-Jan-2012 |
alc |
Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and tmpfs_nocacheread(). It is both unnecessary and a pessimization. It results in either the page being zeroed twice or zeroed first and then overwritten by an I/O operation.
MFC after: 3 weeks
|
229272 |
02-Jan-2012 |
ed |
Use strchr() and strrchr().
It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it.
For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.
|
229200 |
01-Jan-2012 |
ed |
Migrate ufs and ext2fs from skpc() to memcchr().
While there, remove a useless check from the code. memcchr() always returns characters unequal to 0xff in this case, so inosused[i] ^ 0xff can never be equal to zero. Also, the fact that memcchr() returns a pointer instead of the number of bytes until the end makes conversion to an offset far easier.
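A user-space sketch of the behaviour described for memcchr() may help here. The names `sketch_memcchr()` and `first_not_full()` are hypothetical and are not the kernel code; the sketch only assumes that the function returns a pointer to the first byte that differs from the given value, or NULL if there is none.

```c
#include <stddef.h>

/*
 * Return a pointer to the first byte in [begin, begin + n) that is not
 * equal to c, or NULL if every byte equals c.  Hypothetical user-space
 * sketch, not the kernel implementation.
 */
static const void *
sketch_memcchr(const void *begin, int c, size_t n)
{
	const unsigned char *p = begin;
	const unsigned char *end = p + n;

	for (; p < end; p++)
		if (*p != (unsigned char)c)
			return (p);
	return (NULL);
}

/* Offset of the first map byte that is not 0xff, or -1 if all are 0xff. */
static long
first_not_full(const unsigned char *map, size_t n)
{
	const unsigned char *p = sketch_memcchr(map, 0xff, n);

	/* A returned pointer converts to an offset by plain subtraction. */
	return (p == NULL ? -1 : (long)(p - map));
}
```

Because a non-NULL result always points at a byte other than 0xff, a follow-up test of that byte against 0xff would be redundant, which is the check the commit removes.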
|
228864 |
24-Dec-2011 |
kevlo |
Discard local array based on return values.
Pointed out by: uqs Found with: Coverity Prevent(tm) CID: 10089
|
228827 |
23-Dec-2011 |
rmacklem |
During investigation of an NFSv4 client crash reported by glebius@, jhb@ spotted that nfscl_getstateid() might modify credentials when called from nfsrpc_read() for the case where p != NULL, whereas nfsrpc_read() only did a crdup() to get new credentials for p == NULL. This bug was introduced by r195510, since pre-r195510 nfscl_getstateid() only modified credentials for the p == NULL case. This patch modifies nfsrpc_read()/nfsrpc_write() so that they do crdup() for the p != NULL case. It is conceivable that this bug caused the crash reported by glebius@, but that will not be determined for some time, since the crash occurred after about 1 month of operation.
Tested by: glebius Reviewed by: jhb MFC after: 2 weeks
|
228796 |
22-Dec-2011 |
kevlo |
Discarding local array based on return values
|
228757 |
21-Dec-2011 |
rmacklem |
jwd@ reported a problem via email where the old NFS client would get a reply of EEXIST from an NFS server when a Mkdir RPC was retried, for an NFS over UDP mount. Upon investigation, it was found that the client was retransmitting the Mkdir RPC request over UDP, but with a different xid. As such, the retransmitted message would miss the Duplicate Request Cache in the server, causing it to reply EEXIST. The kernel client side UDP rpc code has two timers. The first one causes a retransmit using the same xid and socket and was set to a fixed value of 3 seconds. (The default can be overridden via CLSET_RETRY_TIMEOUT.) The second one creates a new socket and xid and should be larger than the first. However, both NFS clients were setting the second timer to nm_timeo ("timeout=<value>" mount argument), which defaulted to 1 second, so the first timer would never time out. This patch fixes both NFS clients so that they set the first timer using nm_timeo and makes the second timer larger than the first one.
Reported by: jwd Tested by: jwd Reviewed by: jhb MFC after: 2 weeks
|
228583 |
16-Dec-2011 |
pfg |
Style cleanups by jh@. Fix a comment from the previous commit. Use M_ZERO instead of bzero() in ext2_vfsops.c Add include guards from PR.
PR: 162564 Approved by: jhb (mentor) MFC after: 2 weeks
|
228560 |
16-Dec-2011 |
rmacklem |
Patch the new NFS server in a manner analogous to r228520 for the old NFS server, so that it correctly handles a count == 0 argument for Commit.
PR: kern/118126 MFC after: 2 weeks
|
228539 |
15-Dec-2011 |
pfg |
Bring in reallocblk to ext2fs.
The feature has been standard for a while in UFS as a means to reduce fragmentation, therefore maintaining consistent performance with filesystem aging. This is also very similar to what ext4 calls "delayed allocation".
In his 2010 GSoC, Zheng Liu ported and benchmarked the missing FANCY_REALLOC code to find more consistent performance improvements than with the preallocation approach.
PR: 159233 Author: Zheng Liu <gnehzuil AT SPAMFREE gmail DOT com> Sponsored by: Google Inc. Approved by: jhb (mentor) MFC after: 2 weeks
|
228507 |
14-Dec-2011 |
pfg |
Merge ext2_readwrite.c into ext2_vnops.c as done in UFS in r101729.
This removes the obfuscations mentioned in ext2_readwrite and places the clustering function in a location similar to other UFS-based implementations.
No performance or functional changes are expected from this move.
PR: kern/159232 Suggested by: bde Approved by: jhb (mentor) MFC after: 2 weeks
|
228361 |
09-Dec-2011 |
jhb |
Explicitly use curthread while manipulating td_fpop during last close of a devfs file descriptor in devfs_close_f(). The passed in td argument may be NULL if the close was invoked by garbage collection of open file descriptors in pending control messages in the socket buffer of a UNIX domain socket after it was closed.
PR: kern/151758 Submitted by: Andrey Shidakov andrey shidakov ru Submitted by: Ruben van Staveren ruben verweg com Reviewed by: kib MFC after: 2 weeks
|
228263 |
04-Dec-2011 |
kib |
Initialize the fifoinfo fi_wgen field on open. Only the difference between fi_wgen and f_seqcount matters, so the change is purely cosmetic, but it makes the code easier to understand.
Submitted by: gianni MFC after: 2 weeks
|
228260 |
04-Dec-2011 |
rmacklem |
This patch adds a sysctl to the NFSv4 server which optionally disables the check for a UTF-8 compliant file name. Enabling this sysctl results in an NFSv4 server that is non-RFC3530 compliant, therefore it is not enabled by default. However, enabling this sysctl results in NFSv3 compatible behaviour and fixes the problem reported by "dan at sunsaturn.com" to freebsd-current@ on Nov. 14, 2011 under the subject "NFSV4 readlink_stat".
Tested by: dan at sunsaturn.com Reviewed by: zack MFC after: 2 weeks
|
228217 |
03-Dec-2011 |
rmacklem |
Post r223774, the NFSv4 client no longer has multiple instances of the same lock_owner4 string. As such, the handling of cleanup of lock_owners could be simplified. This simplification permitted the client to do a ReleaseLockOwner operation when the process that the lock_owner4 string represents, has exited. This permits the server to release any storage related to the lock_owner4 string before the associated open is closed. Without this change, it is possible to exhaust a server's storage when a long running process opens a file and then many child processes do locking on the file, because the open doesn't get closed. A similar patch was applied to the Linux NFSv4 client recently so that it wouldn't exhaust a server's storage.
Reviewed by: zack MFC after: 2 weeks
|
228185 |
01-Dec-2011 |
jhb |
Enhance the sequential access heuristic used to perform readahead in the NFS server and reuse it for writes as well to allow writes to the backing store to be clustered.
- Use a prime number for the size of the heuristic table (1017 is not prime).
- Move the logic to locate a heuristic entry from the table and compute the sequential count out of VOP_READ() and into a separate routine.
- Use the logic from sequential_heuristic() in vfs_vnops.c to update the seqcount when a sequential access is performed rather than just increasing seqcount by 1. This lets the clustering count ramp up faster.
- Allow for some reordering of RPCs and if it is detected leave the current seqcount as-is rather than dropping back to a seqcount of 1. Also, when out of order access is encountered, cut seqcount in half rather than dropping it all the way back to 1 to further aid with reordering.
- Fix the new NFS server to properly update the next offset after a successful VOP_READ() so that the readahead actually works.
Some of these changes came from an earlier patch by Bjorn Gronwall that was forwarded to me by bde@.
Discussed with: bde, rmacklem, fs@ Submitted by: Bjorn Gronwall (1, 4) MFC after: 2 weeks
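The update rules listed above can be sketched roughly as follows. This is a user-space illustration, not the committed code: the constants, `struct seq_state`, and the way reordering is detected (a request landing within a small window of the expected offset) are all assumptions made for the example.

```c
#include <stdint.h>

#define SKETCH_SEQMAX	127			/* hypothetical cap on the count */
#define SKETCH_BLKSIZE	16384			/* hypothetical ramp-up unit */
#define SKETCH_WINDOW	(4 * SKETCH_BLKSIZE)	/* hypothetical reorder window */

struct seq_state {
	uint64_t next_off;	/* offset expected for the next request */
	int	 seqcount;	/* current sequential-access estimate */
};

/* Update the estimate for a request at [off, off + len) and return it. */
static int
seq_update(struct seq_state *st, uint64_t off, uint32_t len)
{
	uint64_t dist;

	if (off == st->next_off) {
		/* Sequential: ramp up in proportion to the request size. */
		st->seqcount += (len + SKETCH_BLKSIZE - 1) / SKETCH_BLKSIZE;
		if (st->seqcount > SKETCH_SEQMAX)
			st->seqcount = SKETCH_SEQMAX;
		st->next_off = off + len;
		return (st->seqcount);
	}

	dist = off > st->next_off ? off - st->next_off : st->next_off - off;
	if (dist <= SKETCH_WINDOW) {
		/* Probably a reordered request: leave the count as-is. */
		return (st->seqcount);
	}

	/* Out-of-order access: halve the count instead of resetting it. */
	st->seqcount /= 2;
	if (st->seqcount < 1)
		st->seqcount = 1;
	st->next_off = off + len;
	return (st->seqcount);
}
```

Halving rather than resetting means that one stray request costs only part of the accumulated count, so clustering can recover quickly.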
|
228156 |
30-Nov-2011 |
kib |
Rename vm_page_set_valid() to vm_page_set_valid_range(). The vm_page_set_valid() is the most reasonable name for the m->valid accessor.
Reviewed by: attilio, alc
|
228023 |
27-Nov-2011 |
kevlo |
Add unicode support to ntfs
Obtained from: imura
|
227834 |
22-Nov-2011 |
trociny |
In procfs_doproccmdline(), if the arguments are not cached, read them from the process stack.
Suggested by: kib Reviewed by: kib Tested by: pho MFC after: 2 weeks
|
227822 |
22-Nov-2011 |
ivoras |
Avoid panics from recursive rename operations. Not a perfect patch but good enough for now.
PR: kern/159418 Submitted by: Gleb Kurtsou Reviewed by: kib MFC after: 1 month
|
227817 |
22-Nov-2011 |
kib |
Put all the messages from msdosfs under the MSDOSFS_DEBUG ifdef. They are confusing to user, and not informative for general consumption.
MFC after: 1 week
|
227809 |
22-Nov-2011 |
rmacklem |
This patch enables the new/default NFS server's use of shared vnode locking for read, readdir, readlink, getattr and access. It is hoped that this will improve server performance for these operations, since they will no longer be serialized for a given file/vnode.
|
227802 |
21-Nov-2011 |
delphij |
Improve the way to calculate available pages in tmpfs:
- Don't deduct wired pages from total usable counts because it does not make any sense. To make things worse, on systems where swap size is smaller than physical memory and that use a lot of wired pages (e.g. ZFS), tmpfs can suddenly have free space of 0 because of this;
- Count cached pages as available; [1]
- Don't count inactive pages as available; technically we could, but that might be too aggressive. [1]
[1] Suggested by kib@
MFC after: 1 week
|
227796 |
21-Nov-2011 |
rmacklem |
Clean up some cruft in the NFSv4 client left over from the OpenBSD port, so that it is more readable. No logic change is made by this commit.
MFC after: 2 weeks
|
227760 |
20-Nov-2011 |
rmacklem |
Add two arguments to the nfsrpc_rellockown() function in the NFSv4 client. This does not change the client's behaviour, but prepares the code so that nfsrpc_rellockown() can be called elsewhere in a future commit.
MFC after: 2 weeks
|
227744 |
20-Nov-2011 |
rmacklem |
Since the nfscl_cleanup() function isn't used by the FreeBSD NFSv4 client, delete the code and fix up the related comments. This should not have any functional effect on the client.
MFC after: 2 weeks
|
227743 |
20-Nov-2011 |
rmacklem |
Post r223774 the NFSv4 client never uses the linked list with the head nfsc_defunctlockowner. This patch simply removes the code that loops through this always empty list, since the code no longer does anything useful. It should not have any effect on the client's behaviour.
MFC after: 2 weeks
|
227697 |
19-Nov-2011 |
kib |
The existing VOP_VPTOCNP() interface has a fatal flaw that is critical for nullfs. The problem is that the resulting vnode is only required to be held on return from a successful call to the vop, instead of being referenced.
The nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to fail with EBADF or the like.
Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix.
Tested by: pho Reviewed by: mckusick MFC after: 3 weeks (subject to re approval)
|
227696 |
19-Nov-2011 |
kib |
Do not use NULLVPTOLOWERVP() in the null_print(). If diagnostic is compiled in, and show vnode is used from ddb on the faulty nullfs vnode, we get panic instead of vnode dump.
MFC after: 1 week
|
227695 |
19-Nov-2011 |
kib |
Use the plain panic calls, without additional printing around them. The debugger and dumping support is adequate.
Tested by: pho MFC after: 1 week
|
227650 |
18-Nov-2011 |
kevlo |
Add unicode support to msdosfs and smbfs; original patches from imura, bug fixes by Kuan-Chung Chiu <buganini at gmail dot com>.
Tested by me in production for several days at work.
|
227576 |
16-Nov-2011 |
kib |
Fix build, use %d for int value formatting.
|
227550 |
16-Nov-2011 |
pho |
Handle invalid large values for getdirentries(2) data buffer size.
In collaboration with: kib Reviewed by: des Reported by: The iknowthis syscall fuzzer. MFC after: 1 week
|
227543 |
15-Nov-2011 |
rmacklem |
Modify the new NFS client so that nfs_fsync() only calls ncl_flush() for regular files. Since other file types don't write into the buffer cache, calling ncl_flush() is almost a no-op. However, it does clear the NMODIFIED flag and this shouldn't be done by nfs_fsync() for directories.
MFC after: 2 weeks
|
227527 |
15-Nov-2011 |
pho |
Removed extra PRELE() call.
MFC after: 1 week
|
227517 |
15-Nov-2011 |
rmacklem |
Move the setting of the default value for nm_wcommitsize to before the nfs_decode_args() call in the new NFS client, so that a specified command line value won't be overwritten. Also, modify the calculation for small values of desiredvnodes to avoid an unusually large value or a divide by zero crash. It seems that the default value for nm_wcommitsize is very conservative and may need to change at some time.
PR: kern/159351 Submitted by: onwahe at gmail.com (earlier version) Reviewed by: jhb MFC after: 2 weeks
|
227507 |
14-Nov-2011 |
jhb |
Finish making 'wcommitsize' an NFS client mount option.
Reviewed by: rmacklem MFC after: 1 week
|
227504 |
14-Nov-2011 |
jhb |
Sync with the old NFS client: Remove an obsolete comment.
|
227494 |
14-Nov-2011 |
rmacklem |
Since NFSv4 byte range locking only works for regular files, add a sanity check for the vnode type to the NFSv4 client.
MFC after: 2 weeks
|
227493 |
13-Nov-2011 |
rmacklem |
Move the assignment of default values for some mount options to before the nfs_decode_args() call in the new NFS client, so they don't overwrite the value specified on the command line.
MFC after: 2 weeks
|
227489 |
13-Nov-2011 |
eadler |
- fix duplicate "a a" in some comments
Submitted by: eadler Approved by: simon MFC after: 3 days
|
227393 |
09-Nov-2011 |
kib |
Lock the thread lock around block that retrieves td_wmesg. Otherwise, procfs could see a thread with assigned td_wchan but still NULL td_wmesg.
Reported and tested by: pho MFC after: 1 week
|
227310 |
07-Nov-2011 |
marcel |
Don asbestos garment and remove the warning about TMPFS being experimental -- highly experimental even. So far the closest to a bug in TMPFS that people have gotten to relates to how ZFS can take away from the memory that TMPFS needs. One can argue that such is not a bug in TMPFS. Irrespective, even if there is a bug here and there in TMPFS, it's not to our own advantage to scare people away from using TMPFS. I for one have been using it, even with ZFS, very successfully.
|
227309 |
07-Nov-2011 |
ed |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
227293 |
07-Nov-2011 |
ed |
Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
|
227267 |
06-Nov-2011 |
ed |
Remove MALLOC_DECLAREs of nonexisting malloc-pools.
After careful grepping, it seems none of these pools can be found in our source tree. They are not in use, nor are they defined.
|
227104 |
05-Nov-2011 |
kib |
Fix typo.
MFC after: 3 days
|
227069 |
04-Nov-2011 |
jhb |
Move the cleanup of f_cdevpriv when the reference count of a devfs file descriptor drops to zero out of _fdrop() and into devfs_close_f() as it is only relevant for devfs file descriptors.
Reviewed by: kib MFC after: 1 week
|
227062 |
03-Nov-2011 |
kib |
Fix kernel panic when d_fdopen csw method is called for NULL fp. This may happen when kernel consumer calls VOP_OPEN().
Reported by: Tavis Ormandy <taviso cmpxchg8b com> through delphij MFC after: 3 days
|
226987 |
01-Nov-2011 |
pho |
Added missing cache purge of from argument for rename().
Reported by: Anton Yuzhaninov <citrin citrin ru> In collaboration with: kib MFC after: 1 week
|
226688 |
24-Oct-2011 |
kib |
The use of VOP_ISLOCKED() without a check for the return values can cause false positives. Replace the #ifdef block with the proper ASSERT_VOP_UNLOCKED() assert.
Tested by: pho MFC after: 1 week
|
226687 |
24-Oct-2011 |
kib |
The only possible error return from null_nodeget() is due to insmntque1 failure (the getnewvnode cannot return an error). In this case, the null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK() in the nullfs_mount() after null_nodeget() failure is wrong.
Tested by: pho MFC after: 1 week
|
226686 |
24-Oct-2011 |
kib |
The covered vnode must be relocked if it was unlocked. Remove VOP_ISLOCKED test because of this and also because it can lead to false positives.
Tested by: pho MFC after: 1 week
|
226681 |
24-Oct-2011 |
pho |
Only unlock if the lock is exclusive.
Reported by: Subbsd <subbsd gmail com> Discussed with: kib
|
226497 |
18-Oct-2011 |
des |
Trace attempts to open a portal device.
Ceterum censeo portalfs esse delendam.
|
226234 |
10-Oct-2011 |
trasz |
Make unionfs also clear VAPPEND when clearing VWRITE, since VAPPEND is just a modifier for VWRITE.
Submitted by: rmacklem
|
226041 |
05-Oct-2011 |
kib |
Export devfs inode number allocator for the kernel consumers.
Reviewed by: jhb MFC after: 2 weeks
|
225617 |
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
225418 |
06-Sep-2011 |
kib |
Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into an atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.
Document the changes to flags field to only require the page lock.
Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
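The idea of lock-free flag updates can be sketched in portable C11. This is only an illustration under assumptions: `struct sk_page`, the `SK_AFLAG_*` bit values and the `sk_*` helpers are hypothetical, and the sketch uses a dedicated atomic field with `<stdatomic.h>` rather than the kernel's atomic operations on the containing word.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SK_AFLAG_REFERENCED	0x01u	/* hypothetical bit values */
#define SK_AFLAG_WRITEABLE	0x02u

struct sk_page {
	_Atomic uint32_t aflags;	/* updated without holding any lock */
	uint32_t	 flags;		/* ordinary flags, lock-protected */
};

/* Set bits with a single atomic read-modify-write; never blocks. */
static void
sk_aflag_set(struct sk_page *m, uint32_t bits)
{
	atomic_fetch_or(&m->aflags, bits);
}

/* Clear bits the same way. */
static void
sk_aflag_clear(struct sk_page *m, uint32_t bits)
{
	atomic_fetch_and(&m->aflags, ~bits);
}

/* Stable entry point for callers that only need "mark as referenced". */
static void
sk_page_reference(struct sk_page *m)
{
	sk_aflag_set(m, SK_AFLAG_REFERENCED);
}
```

A wrapper like `sk_page_reference()` keeps callers independent of which field and which bit the flag lives in, which is the kind of stability the message attributes to vm_page_reference(9).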
|
225356 |
03-Sep-2011 |
rmacklem |
Fix the NFS servers so that they can do a Lookup of "..", which requires that ni_strictrelative be set to 0, post-r224810.
Tested by: swills (earlier version), geo dot liaskos at gmail.com Approved by: re (kib)
|
225049 |
20-Aug-2011 |
rmacklem |
Fix the NFSv4 server so that it returns NFSERR_SYMLINK when an attempt to do an Open operation on any type of file other than VREG is done. A recent discussion on the IETF working group's mailing list (nfsv4@ietf.org) decided that NFSERR_SYMLINK should be returned for all non-regular files and not just symlinks, so that the Linux client would work correctly. This change does not affect the FreeBSD NFSv4 client and is not believed to have a negative effect on other NFSv4 clients.
Reviewed by: zkirsch Approved by: re (kib) MFC after: 2 weeks
|
224915 |
16-Aug-2011 |
kib |
Do not return success and a string "unknown" when vn_fullpath() was unable to resolve the path of the text vnode of the process. The behaviour is very confusing for any consumer of the procfs, in particular, java.
Reported and tested by: bf MFC after: 2 weeks Approved by: re (bz)
|
224914 |
16-Aug-2011 |
kib |
Add the fo_chown and fo_chmod methods to struct fileops and use them to implement fchown(2) and fchmod(2) support for several file types that previously lacked it. Add MAC entries for chown/chmod done on posix shared memory and (old) in-kernel posix semaphores.
Based on the submission by: glebius Reviewed by: rwatson Approved by: re (bz)
|
224911 |
16-Aug-2011 |
jonathan |
Fix a merge conflict.
r224086 added "goto out"-style error handling to nfssvc_nfsd(), in order to reliably call NFSEXITCODE() before returning. Our Capsicum changes, based on the old "return (error)" model, did not merge nicely.
Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc
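For readers unfamiliar with the two styles mentioned above, here is a minimal sketch of "goto out" error handling. `sk_exitcode()` and `sk_handler()` are hypothetical names; the hook merely stands in for whatever must run before every return.

```c
#include <errno.h>

static int sk_exit_hook_calls;	/* counts how often the exit hook ran */

/* Hypothetical hook that must run once before every return. */
static void
sk_exitcode(int error)
{
	(void)error;
	sk_exit_hook_calls++;
}

static int
sk_handler(int arg)
{
	int error = 0;

	if (arg < 0) {
		error = EINVAL;
		goto out;	/* instead of "return (error)" */
	}
	if (arg > 100) {
		error = ERANGE;
		goto out;
	}
	/* Normal work would go here. */
out:
	sk_exitcode(error);	/* reached on every path */
	return (error);
}
```

Code merged from a branch that still uses early `return (error)` statements would bypass the hook, which is why the two styles do not combine cleanly.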
|
224778 |
11-Aug-2011 |
rwatson |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent.
Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
224743 |
09-Aug-2011 |
kib |
Do not update mountpoint generation counter to the value which was not yet acted upon by devfs_populate().
Submitted by: Kohji Okuno <okuno.kohji jp panasonic com> Approved by: re (bz) MFC after: 1 week
|
224637 |
03-Aug-2011 |
zack |
Fix an NFS server issue where it was not correctly setting the eof flag when a READ had hit the end of the file. Also, clean up some cruft in the code.
Approved by: re (kib) Reviewed by: rmacklem MFC after: 2 weeks
|
224606 |
02-Aug-2011 |
rmacklem |
Fix a LOR in the NFS client which could cause a deadlock. This was reported to the mailing list freebsd-net@freebsd.org on July 21, 2011 under the subject "LOR with nfsclient sillyrename". The LOR occurred when nfs_inactive() called vrele(sp->s_dvp) while holding the vnode lock on the file in s_dvp. This patch modifies the client so that it performs the vrele(sp->s_dvp) as a separate task to avoid the LOR. This fix was discussed with jhb@ and kib@, who both proposed variations of it.
Tested by: pho, jlott at averesystems.com Submitted by: jhb (earlier version) Reviewed by: kib Approved by: re (kib) MFC after: 2 weeks
|
224554 |
31-Jul-2011 |
rmacklem |
Fix rename in the new NFS server so that it does not require a recursive vnode lock on the directory for the case where the new file name is in the same directory as the old one. The patch handles this as a special case, recognized by the new directory having the same file handle as the old one and just VREF()s the old dir vnode for this case, instead of doing a second VFS_FHTOVP() to get it. This is required so that the server will work for file systems like msdosfs, that do not support recursive vnode locking. This problem was discovered during recent testing by pho@ when exporting an msdosfs file system via the new NFS server.
Tested by: pho Reviewed by: zkirsch Approved by: re (kib) MFC after: 2 weeks
|
224532 |
30-Jul-2011 |
rmacklem |
The new NFS client failed to vput() the new vnode if a setattr failed after the file was created in nfs_create(). This would probably only happen during a forced dismount. The old NFS client does have a vput() for this case. Detected by pho during recent testing, where an open syscall returned with a vnode still locked.
Tested by: pho Approved by: re (kib) MFC after: 2 weeks
|
224290 |
24-Jul-2011 |
mckusick |
This update changes the mnt_flag field in the mount structure from 32 bits to 64 bits and eliminates the unused mnt_xflag field. The existing mnt_flag field is completely out of bits, so this update gives us room to expand. Note that the f_flags field in the statfs structure is already 64 bits, so the expanded mnt_flag field can be exported without having to make any changes in the statfs structure.
Approved by: re (bz)
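Why a full 32-bit field leaves no room to expand can be shown with a short sketch. `SK_MNT_NEWFLAG` is a hypothetical flag placed just above bit 31; it is not an existing mount flag.

```c
#include <stdint.h>

/* Hypothetical flag that needs a bit above bit 31. */
#define SK_MNT_NEWFLAG	(UINT64_C(1) << 32)

/* Storing the flag in a 32-bit field silently drops it. */
static uint32_t
sk_set_flag32(uint32_t flags, uint64_t flag)
{
	return ((uint32_t)(flags | flag));
}

/* A 64-bit field keeps it. */
static uint64_t
sk_set_flag64(uint64_t flags, uint64_t flag)
{
	return (flags | flag);
}
```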
|
224121 |
17-Jul-2011 |
zack |
Revert revision 224079 as Rick pointed out that I would be calling VOP_PATHCONF without the vnode lock held.
Implicitly approved by: zml (mentor)
|
224117 |
16-Jul-2011 |
rmacklem |
The new NFSv4 client handled NFSERR_GRACE as a fatal error for the remove and rename operations. Some NFSv4 servers will report NFSERR_GRACE for these operations. This patch changes the behaviour of the client so that it handles NFSERR_GRACE like NFSERR_DELAY for non-state related operations like remove and rename. It also exempts the delegreturn operation from handling within newnfs_request() for NFSERR_DELAY/NFSERR_GRACE so that it can handle NFSERR_GRACE in the same manner as before. This problem was resolved thanks to discussion with bfields at fieldses.org. The problem was identified at the recent NFSv4 interoperability bakeathon.
MFC after: 2 weeks
|
224086 |
16-Jul-2011 |
zack |
Add DEXITCODE plumbing to NFS.
Isilon has the concept of an in-memory exit-code ring that saves the last exit code of a function and allows for stack tracing. This is very helpful when debugging tough issues.
This patch is essentially a no-op for BSD at this point, until we upstream the dexitcode logic itself. The patch adds DEXITCODE calls to every NFS function that returns an errno error code. A number of code paths were also reorganized to have single exit paths, to reduce code duplication.
Submitted by: David Kwan <dkwan@isilon.com> Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
224083 |
16-Jul-2011 |
zack |
Simple find/replace of VOP_ISLOCKED -> NFSVOPISLOCKED. This is done so that NFSVOPISLOCKED can be modified later to add enhanced logging and assertions.
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
224082 |
16-Jul-2011 |
zack |
Simple find/replace of VOP_UNLOCK -> NFSVOPUNLOCK. This is done so that NFSVOPUNLOCK can be modified later to add enhanced logging and assertions.
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
224081 |
16-Jul-2011 |
zack |
Simple find/replace of vn_lock -> NFSVOPLOCK. This is done so that NFSVOPLOCK can be modified later to add enhanced logging and assertions.
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
224080 |
16-Jul-2011 |
zack |
Remove unnecessary thread pointer from VOPLOCK macros and current users.
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
224079 |
16-Jul-2011 |
zack |
Change loadattr and fillattr to ask the file system for the pathconf variable.
Small modification where VOP_PATHCONF was being called directly.
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
224078 |
16-Jul-2011 |
zack |
Move nfsvno_pathconf to be accessible to sys/fs/nfs; no functionality change.
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
224077 |
16-Jul-2011 |
zack |
Small acl patch to return the aclerror that comes back from nfsrv_dissectacl(). This fixes a problem where ATTRNOTSUPP was being returned instead of BADOWNER.
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
223988 |
13-Jul-2011 |
kib |
While fixing the looping of a thread while devfs vnode is reclaimed, r179247 introduced a possibility of devfs_allocv() returning spurious ENOENT. If the vnode is selected by vnlru daemon for reclamation, then devfs_allocv() can get ENOENT from vget() due to devfs_close() dropping vnode lock around the call to cdevsw d_close method.
Use LK_RETRY in the vget() call, and do some part of the devfs_reclaim() work in devfs_allocv(), clearing vp->v_data and de->de_vnode. Retry the allocation of the vnode, now with de->de_vnode == NULL.
The check vp->v_data == NULL at the start of devfs_close() cannot be affected by the change, since vnode lock must be held while VI_DOOMED is set, and only dropped after the check.
Reported and tested by: Kohji Okuno <okuno.kohji jp panasonic com> Reviewed by: attilio MFC after: 3 weeks
|
223971 |
13-Jul-2011 |
rmacklem |
r222389 introduced a case where the NFSv4 client could loop in nfscl_getcl() when a forced dismount is in progress, because nfsv4_lock() will return 0 without sleeping when MNTK_UNMOUNTF is set. This patch fixes it so it won't loop calling nfsv4_lock() for this case.
MFC after: 2 weeks
|
223843 |
07-Jul-2011 |
jonathan |
Make a comment more accurate.
This comment refers to CAP_NT_SMBS, which does not exist; it should refer to SMB_CAP_NT_SMBS. Fixing this comment makes it easier for people interested in Capsicum to grep around for capability rights, whose identifiers are of the form 'CAP_[A-Z_]'.
Approved by: mentor (rwatson), re (Capsicum blanket) Sponsored by: Google Inc
|
223774 |
04-Jul-2011 |
rmacklem |
The algorithm used by nfscl_getopen() could have resulted in multiple instances of the same lock_owner when a process both inherited an open file descriptor plus opened the same file itself. Since some NFSv4 servers cannot handle multiple instances of the same lock_owner string, this patch changes the algorithm used by nfscl_getopen() in the new NFSv4 client to keep that from happening. The new algorithm is simpler, since there is no longer any need to ascend the process's parentage tree because all NFSv4 Closes for a file are done at VOP_INACTIVE()/VOP_RECLAIM(), making the Opens indistinct w.r.t. use with Lock Ops. This problem was discovered at the recent NFSv4 interoperability Bakeathon.
MFC after: 2 weeks
|
223747 |
03-Jul-2011 |
rmacklem |
Modify the new NFSv4 client so that it appends a file handle to the lock_owner4 string that goes on the wire. Also, add code to do a ReleaseLockOwner Op on the lock_owner4 string before a Close. Apparently not all NFSv4 servers handle multiple instances of the same lock_owner4 string, at least not in a compatible way. This patch avoids having multiple instances, except for one unusual case, which will be fixed by a future commit. Found at the recent NFSv4 interoperability Bakeathon.
Tested by: tdh at excfb.com MFC after: 2 weeks
|
223677 |
29-Jun-2011 |
alc |
Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages.
This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages.
Update all of the existing assertions on pmap_remove_all() to reflect this change.
Reviewed by: kib
|
223657 |
28-Jun-2011 |
rmacklem |
Fix the new NFSv4 client so that it doesn't fill the cached mode attribute in as 0 when doing writes. The change adds the Mode attribute plus the others except Owner and Owner_group to the list requested by the NFSv4 Write Operation. This fixed a problem where an executable file built by "cc" would get mode 0111 instead of 0755 for some NFSv4 servers. Found at the recent NFSv4 interoperability Bakeathon.
Tested by: tdh at excfb.com MFC after: 2 weeks
|
223441 |
22-Jun-2011 |
rmacklem |
Plug an mbuf leak in the new NFS client that occurred when a server replied NFS3ERR_JUKEBOX/NFS4ERR_DELAY to an rpc. This affected both NFSv3 and NFSv4. Found during testing at the recent NFSv4 interoperability Bakeathon.
MFC after: 2 weeks
|
223436 |
22-Jun-2011 |
rmacklem |
Fix the new NFSv4 client so that it uses the same uid as was used for doing a mount when performing system operations on AUTH_SYS mounts. This resolved an issue when mounting a Linux server. Found during testing at the recent NFSv4 interoperability Bakeathon.
MFC after: 2 weeks
|
223373 |
21-Jun-2011 |
rmacklem |
Fix the new NFSv4 server so that it checks for VREAD_ACL when a client does a Getattr for an ACL and not VREAD_ATTRIBUTES. This was found during the recent NFSv4 interoperability Bakeathon.
MFC after: 2 weeks
|
223349 |
20-Jun-2011 |
rmacklem |
Fix the new NFSv4 server so that it only allows Lookup of directories and symbolic links when traversing non-exported file systems. Found during the recent NFSv4 interoperability Bakeathon.
MFC after: 2 weeks
|
223348 |
20-Jun-2011 |
rmacklem |
Fix the new NFSv4 server so that it allows Access and Readlink operations while traversing non-exported file systems. This is required for some non-FreeBSD clients to do NFSv4 mounts. Found during the recent NFSv4 interoperability Bakeathon.
MFC after: 2 weeks
|
223312 |
19-Jun-2011 |
rmacklem |
Fix a number of places where the new NFS server did not lock the mutex when manipulating rc_flag in the DRC cache. This is believed to fix a hung server that was reported to the freebsd-fs@ list on June 9 under the subject heading "New NFS server stress test hang", where all the threads were waiting for the RC_LOCKED flag to clear.
Tested by: jwd at slowblink.com MFC after: 2 weeks
|
223309 |
19-Jun-2011 |
rmacklem |
Fix the kgssapi so that it can be loaded as a module. Currently the NFS subsystems use five of the rpcsec_gss/kgssapi entry points, but since it was not obvious which others might be useful, all nineteen were included. Basically the nineteen entry points are set in a structure called rpc_gss_entries and inline functions defined in sys/rpc/rpcsec_gss.h check for the entry points being non-NULL and then call them. A default value is returned otherwise. Requested by rwatson.
Reviewed by: jhb MFC after: 2 weeks
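The entry-point scheme described above can be sketched as follows. This is a minimal illustration, not the contents of sys/rpc/rpcsec_gss.h: the structure, the single member, the wrapper, and the default value of -1 are all hypothetical stand-ins for the nineteen real entry points.

```c
#include <stddef.h>

/* Hypothetical table of optional entry points (the real one has nineteen). */
struct gss_entries_sketch {
	int (*get_error)(int code);
};

/* Every pointer starts out NULL, meaning "module not loaded". */
static struct gss_entries_sketch gss_entries_sketch;

/* Inline wrapper: call through the pointer if it is set, else return a default. */
static inline int
gss_get_error_sketch(int code)
{
	if (gss_entries_sketch.get_error != NULL)
		return ((*gss_entries_sketch.get_error)(code));
	return (-1);	/* assumed default when the module is absent */
}

/* Stand-in for a function that a loaded module would register. */
static int
loaded_get_error(int code)
{
	return (code + 100);
}
```

A caller never tests for the module itself; it calls the wrapper and gets the default until something stores a function pointer in the table.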
|
223280 |
18-Jun-2011 |
rmacklem |
Add DTrace support to the new NFS client. This is essentially cloned from the old NFS client, plus additions for NFSv4. A review of this code is in progress, however it was felt by the reviewer that it could go in now, before code slush. Any changes required by the review can be committed as bug fixes later.
|
222722 |
05-Jun-2011 |
rmacklem |
Add support for flock(2) locks to the new NFSv4 client. I think this should be ok, since the client now delays NFSv4 Close operations until VOP_INACTIVE()/VOP_RECLAIM(). As such, there should be no risk that the NFSv4 Open is closed while an associated byte range lock still exists.
Tested by: avg MFC after: 2 weeks
|
222719 |
05-Jun-2011 |
rmacklem |
The new NFSv4 client was erroneously using "p" instead of "p_leader" for the "id" for POSIX byte range locking. I think this would only have affected processes created by rfork(2) with the RFTHREAD flag specified. This patch fixes that by passing the "id" down through the various functions from nfs_advlock().
MFC after: 2 weeks
|
222718 |
05-Jun-2011 |
rmacklem |
Fix the new NFSv4 client so that it doesn't crash when a mount is done for a VIMAGE kernel.
Tested by: glz at hidden-powers dot com Reviewed by: bz MFC after: 2 weeks
|
222663 |
04-Jun-2011 |
rmacklem |
Modify the new NFS server so that the NFSv3 Pathconf RPC doesn't return an error when the underlying file system lacks support for any of the four _PC_xxx values used, by falling back to default values.
Tested by: avg MFC after: 2 weeks
|
222586 |
01-Jun-2011 |
kib |
In the VOP_PUTPAGES() implementations, change the default error from VM_PAGER_AGAIN to VM_PAGER_ERROR for the unwritten pages. Return VM_PAGER_AGAIN for the partially written page. Always forward at least one page in the loop of vm_object_page_clean().
VM_PAGER_ERROR causes the page reactivation and does not clear the page dirty state, so the write is not lost.
The change fixes an infinite loop in vm_object_page_clean() when the filesystem returns permanent errors for some page writes.
Reported and tested by: gavin Reviewed by: alc, rmacklem MFC after: 1 week
|
222540 |
31-May-2011 |
rmacklem |
Fix the new NFS client so that it doesn't do an NFSv3 Pathconf RPC for cases where the reply doesn't include the answer. This fixes a problem reported by avg@ where the NFSv3 Pathconf RPC would fail when "ls -l" did an lpathconf(2) for _PC_ACL_NFS4.
Tested by: avg MFC after: 2 weeks
|
222389 |
27-May-2011 |
rmacklem |
Fix the new NFS client so that it handles NFSv4 state correctly during a forced dismount. This required that the exclusive and shared (refcnt) sleep lock functions check for MNTK_UNMOUNTF before sleeping, so that they won't block while nfscl_umount() is getting rid of the state. As such, a "struct mount *" argument was added to the locking functions. I believe the only remaining case where a forced dismount can get hung in the kernel is when a thread is already attempting to do a TCP connect to a dead server when the krpc client structure called nr_client is NULL. This will only happen just after a "mount -u" with options that force a new TCP connection is done, so it shouldn't be a problem in practice.
MFC after: 2 weeks
|
222329 |
26-May-2011 |
rmacklem |
Add a check for MNTK_UNMOUNTF at the beginning of nfs_sync() in the new NFS client so that a forced dismount doesn't get stuck in the VFS_SYNC() call that happens before VFS_UNMOUNT() in dounmount(). Additional changes are needed before forced dismounts will work.
MFC after: 2 weeks
|
222291 |
25-May-2011 |
rmacklem |
Add some missing mutex locking to the new NFS client.
MFC after: 2 weeks
|
222289 |
25-May-2011 |
rmacklem |
Fix the new NFS client so that it correctly sets the "must_commit" argument for a write RPC when it succeeds for the first one and fails for a subsequent RPC within the same call to the function. This makes it compatible with the old NFS client for this case.
MFC after: 2 weeks
|
222233 |
23-May-2011 |
rmacklem |
Set the MNT_NFS4ACLS flag for an NFSv4 client mount if the NFSv4 server supports it. Requested by trasz.
MFC after: 2 weeks
|
222187 |
22-May-2011 |
alc |
Eliminate duplicate #include's.
|
222167 |
22-May-2011 |
rmacklem |
Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed.
Reviewed by: kib
|
222075 |
18-May-2011 |
rmacklem |
Add a sanity check for the existence of an "addr" option to both NFS clients. This avoids the crash reported by Sergey Kandaurov (pluknet@gmail.com) to the freebsd-fs@ list with subject "[old nfsclient] different nmount() args passed from mount vs mount_nfs" dated May 17, 2011.
Tested by: pluknet at gmail.com (old nfs client) MFC after: 2 weeks
|
221973 |
15-May-2011 |
rmacklem |
Change the sysctl naming for the old and new NFS clients to vfs.oldnfs.xxx and vfs.nfs.xxx respectively. This makes the default nfs client use vfs.nfs.xxx after r221124.
|
221867 |
14-May-2011 |
jhb |
Merge comments about converting directory entries to be more direct and concise.
Inspired by: Gleb Kurtsou
|
221615 |
08-May-2011 |
rmacklem |
Change the new NFS server so that it uses vfs.nfsd naming for its sysctls instead of vfs.newnfs. This separates the names from the ones used by the client.
|
221537 |
06-May-2011 |
rmacklem |
Set the initial value of maxfilesize to OFF_MAX in the new NFS client. It will then be reduced to whatever the server says it can support. There might be an argument that this could be one block larger, but since NFS is a byte granular system, I chose not to do that.
Suggested by: Matt Dillon Tested by: Daniel Braniss (earlier version) MFC after: 2 weeks
|
221523 |
06-May-2011 |
mav |
Increase the NFS_TICKINTVL value from 10 to 500. The callout now does useful work only once per second, so the other 99 calls per second were useless and only kept an idle system from sleeping properly.
Reviewed by: rmacklem
|
221517 |
06-May-2011 |
rmacklem |
Change the new NFS server so that it returns 0 when the f_bavail or f_ffree fields of "struct statfs" are negative, since the values that go on the wire are unsigned and will appear to be very large positive values otherwise. This makes the handling of a negative f_bavail compatible with the old/regular NFS server.
MFC after: 2 weeks
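A minimal sketch of why the clamp matters, assuming a signed 64-bit count that is placed in an unsigned 64-bit wire field. The helper name is hypothetical and is not the server's actual code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a signed count to the unsigned value that
 * goes on the wire, reporting 0 instead of letting a negative value wrap
 * around to a huge positive one.
 */
static uint64_t
clamp_nonneg_sketch(int64_t v)
{
	return (v < 0 ? 0 : (uint64_t)v);
}
```

Without the clamp, a plain cast of -1 would appear on the wire as UINT64_MAX.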
|
221467 |
05-May-2011 |
rmacklem |
Fix the new NFS client so that it handles the 64bit fields that are now in "struct statfs" for NFSv3 and NFSv4. Since the ffiles value is uint64_t on the wire, I clip the value to INT64_MAX to avoid setting f_ffree negative.
Tested by: kib MFC after: 2 weeks
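The clipping described above can be sketched like this, assuming the destination field is a signed 64-bit integer. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store an unsigned 64-bit wire value in a signed
 * 64-bit field, clipping to INT64_MAX so the result is never negative.
 */
static int64_t
clip_to_int64_sketch(uint64_t wire)
{
	return (wire > (uint64_t)INT64_MAX ? INT64_MAX : (int64_t)wire);
}
```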
|
221462 |
04-May-2011 |
rmacklem |
Add a comment noting that the NFS code assumes that the values of error numbers in sys/errno.h will be the same as the ones specified by the NFS RFCs and that the code needs to be fixed if error numbers are changed in sys/errno.h.
Suggested by: Peter Jeremy MFC after: 2 weeks
|
221439 |
04-May-2011 |
rmacklem |
Add kernel support for NFSSVC_ZEROCLTSTATS and NFSSVC_ZEROSRVSTATS so that they can be used by nfsstat(1) to implement the "-z" option for the new NFS subsystem.
MFC after: 2 weeks
|
221438 |
04-May-2011 |
rmacklem |
Revert r221306, since NFSSVC_ZEROSTATS zeroed both client and server stats, when separate modifiers for NFSSVC_GETSTATS for each of client and server stats are what is required by nfsstat(1).
|
221436 |
04-May-2011 |
ru |
Implemented a mount option "nocto" that disables cache coherency checking at open time. It may improve performance for read-only NFS mounts. Use deliberately.
MFC after: 1 week Reviewed by: rmacklem, jhb (earlier version)
|
221429 |
04-May-2011 |
ru |
In ncl_printf(), call vprintf() instead of printf().
MFC after: 3 days
|
221306 |
01-May-2011 |
rmacklem |
Add the kernel support needed to zero out the nfsstats structure for the new NFS subsystem. This will be used by nfsstats.c to implement the "-z" option.
MFC after: 2 weeks
|
221261 |
30-Apr-2011 |
kib |
Clarify the comment.
MFC after: 1 week
|
221205 |
29-Apr-2011 |
rmacklem |
The build was broken by r221190 for 64bit arches like amd64. This patch fixes it.
MFC after: 2 weeks
|
221190 |
28-Apr-2011 |
rmacklem |
Fix the new NFS client so that it handles the "nfs_args" value in mnt_optnew. This is needed so that the old mount(2) syscall works and that is needed so that amd(8) works. The code was basically just cribbed from sys/nfsclient/nfs_vfsops.c with minor changes. This patch is mainly to fix the new NFS client so that amd(8) works with it. Thanks go to Craig Rodrigues for helping with this.
Tested by: Craig Rodrigues (for amd) MFC after: 2 weeks
|
221183 |
28-Apr-2011 |
jhb |
Update a comment since ext2fs does not use SU.
Reviewed by: kib
|
221176 |
28-Apr-2011 |
jhb |
The b_dep field of buffers is always empty for ext2fs, it is only used for SU in FFS.
Reported by: kib
|
221166 |
28-Apr-2011 |
jhb |
Sync with several changes in UFS/FFS:
- 77115: Implement support for O_DIRECT.
- 98425: Fix a performance issue introduced in 70131 that was causing reads before writes even when writing full blocks.
- 98658: Rename the BALLOC flags from B_* to BA_* to avoid confusion with the struct buf B_ flags.
- 100344: Merge the BA_ and IO_ flags so that they may both be used in the same flags word. This merger is possible by assigning the IO_ flags to the low sixteen bits and the BA_ flags the high sixteen bits.
- 105422: Fix a file-rewrite performance case.
- 129545: Implement IO_INVAL in VOP_WRITE() by marking the buffer as "no cache".
- Re-add the DOINGASYNC() macro and use it to control asynchronous writes. Change i-node updates to honor DOINGASYNC() instead of always being synchronous.
- Use a PRIV_VFS_RETAINSUGID check instead of checking cr_uid against 0 directly when deciding whether or not to clear suid and sgid bits.
Submitted by: Pedro F. Giffuni giffunip at yahoo
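The low/high sixteen-bit split mentioned for 100344 can be sketched as below. The mask names and the two flag values are hypothetical, chosen only to show that the two families cannot collide in one 32-bit word; they are not the kernel's definitions.

```c
#include <stdint.h>

#define IO_SKETCH_MASK		0x0000ffffu	/* low sixteen bits: IO_-style flags */
#define BA_SKETCH_MASK		0xffff0000u	/* high sixteen bits: BA_-style flags */

#define IO_SKETCH_SYNC		0x00000001u	/* hypothetical value */
#define BA_SKETCH_CLRBUF	0x00010000u	/* hypothetical value */

/* Extract only the IO_-style flags from a combined flags word. */
static uint32_t
io_part_sketch(uint32_t flags)
{
	return (flags & IO_SKETCH_MASK);
}

/* Extract only the BA_-style flags from a combined flags word. */
static uint32_t
ba_part_sketch(uint32_t flags)
{
	return (flags & BA_SKETCH_MASK);
}
```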
|
221139 |
27-Apr-2011 |
rmacklem |
Fix module names and dependencies so the NFS clients will load correctly as modules after r221124.
|
221128 |
27-Apr-2011 |
jhb |
Use a private EXT2_ROOTINO constant instead of redefining ROOTINO.
Submitted by: Pedro F. Giffuni giffunip at yahoo
|
221126 |
27-Apr-2011 |
jhb |
Various style fixes including using uint*_t instead of u_int*_t.
Submitted by: Pedro F. Giffuni giffunip at yahoo
|
221124 |
27-Apr-2011 |
rmacklem |
This patch changes head so that the default NFS client is now the new NFS client (which I guess is no longer experimental). The fstype "newnfs" is now "nfs" and the regular/old NFS client is now fstype "oldnfs". Although mounts via fstype "nfs" will usually work without userland changes, an updated mount_nfs(8) binary is needed for kernels built with "options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and mount(8) binaries are needed to do mounts for fstype "oldnfs". The GENERIC kernel configs have been changed to use options NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER. For kernels being used on diskless NFS root systems, "options NFSCL" must be in the kernel config. Discussed on freebsd-fs@.
|
221066 |
26-Apr-2011 |
rmacklem |
Fix a kernel linking problem introduced by r221032, r221040 when building kernels that don't have "options NFS_ROOT" specified. I plan on moving the functions that use these data structures into the shared code in sys/nfs/nfs_diskless.c in a future commit. At that time, these definitions will no longer be needed in nfs_vfsops.c and nfs_clvfsops.c.
MFC after: 2 weeks
|
221040 |
25-Apr-2011 |
rmacklem |
Modify the experimental (newnfs) NFS client so that it uses the same diskless NFS root code as the regular client, which was moved to sys/nfs by r221032. This fixes the newnfs client so that it can do an NFSv3 diskless root file system.
MFC after: 2 weeks
|
221018 |
25-Apr-2011 |
rmacklem |
Fix the experimental NFS client so that it does not bogusly set the f_flags field of "struct statfs". This had the interesting effect of making the NFSv4 mounts "disappear" after r221014, since NFSMNT_NFSV4 and MNT_IGNORE became the same bit.
MFC after: 2 weeks
|
221014 |
25-Apr-2011 |
rmacklem |
Modify the experimental NFS client so that it uses the same "struct nfs_args" as the regular NFS client. This is needed so that the old mount(2) syscall will work and it makes sharing of the diskless NFS root code easier. Early in the porting exercise I introduced a new revision of nfs_args, but didn't actually need it, thanks to nmount(2). I re-introduced the NFSMNT_KERB flag, since it does essentially the same thing and the old one would not have been used because it never worked. I also added a few new NFSMNT_xxx flags to sys/nfsclient/nfs_args.h that are used by the experimental NFS client.
MFC after: 2 weeks
|
220928 |
21-Apr-2011 |
rmacklem |
Remove the nm_mtx mutex locking from the test for nm_maxfilesize. This value rarely, if ever, changes and the nm_mtx mutex is locked/unlocked earlier in the function, which should be sufficient to avoid getting a stale cached value for it. There is a discussion w.r.t. what these tests should be, but I've left them basically the same as the regular NFS client for now.
Suggested by: pjd MFC after: 2 weeks
|
220921 |
21-Apr-2011 |
rmacklem |
Revert r220906, since the vp isn't always locked when nfscl_request() is called. It will need a more involved patch.
|
220906 |
20-Apr-2011 |
rmacklem |
Add a check for VI_DOOMED at the beginning of nfscl_request() so that it won't try and use vp->v_mount to do an RPC during a forced dismount. There needs to be at least one more kernel commit, plus a change to the umount(8) command before forced dismounts will work for the experimental NFS client.
MFC after: 2 weeks
|
220877 |
20-Apr-2011 |
rmacklem |
Modify the offset + size checks for read and write in the experimental NFS client to take care of overflows for the calls above the buffer cache layer in a manner similar to r220876. Thanks go to dillon at apollo.backplane.com for providing the snippet of code that does this.
MFC after: 2 weeks
|
220876 |
20-Apr-2011 |
rmacklem |
Modify the offset + size checks for read and write in the experimental NFS client to take care of overflows. Thanks go to dillon at apollo.backplane.com for providing the snippet of code that does this.
MFC after: 2 weeks
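One way to make an offset + size check safe against overflow is sketched below. It is not the snippet credited above, which this log does not reproduce; the function name and the choice to reject negative inputs are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: return true if the byte range starting at offset
 * with the given size ends at or below maxfilesize, without ever
 * computing offset + size (which could overflow a signed 64-bit value).
 */
static bool
range_ok_sketch(int64_t offset, int64_t size, int64_t maxfilesize)
{
	if (offset < 0 || size < 0 || maxfilesize < 0)
		return (false);
	if (offset > maxfilesize)
		return (false);
	/* Compare the size against the room left after offset. */
	return (size <= maxfilesize - offset);
}
```

Because offset is known to be between 0 and maxfilesize when the subtraction runs, maxfilesize - offset cannot overflow.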
|
220810 |
19-Apr-2011 |
rmacklem |
Fix up handling of the nfsmount structure in read and write within the experimental NFS client. Mostly add mutex locking and use the same rsize, wsize during the operation by keeping a local copy of it. This is another change that brings it closer to the regular NFS client.
MFC after: 2 weeks
|
220807 |
18-Apr-2011 |
rmacklem |
Revert r220761 since, as kib@ pointed out, the case of adding the check to nfsrpc_close() isn't useful. Also, the check in nfscl_getcl() must be more involved, since it needs to check before and after the acquisition of the refcnt on nfsc_lock, while the mutex that protects the client state data is held.
|
220764 |
18-Apr-2011 |
rmacklem |
Add a vput() to nfs_lookitup() in the experimental NFS client for a case that will probably never happen. It can only happen if a server were to successfully lookup a file, but not return attributes for that file. Although technically allowed by the NFSv3 RFC, I doubt any server would ever do this. However, if it did, the client would have not vput()'d the new vnode when it needed to do so.
MFC after: 2 weeks
|
220763 |
18-Apr-2011 |
rmacklem |
Add vput() calls in two places in the experimental NFS client that would be needed if, in the future, nfscl_loadattrcache() were to return an error. Currently nfscl_loadattrcache() never returns an error, so these cases never currently happen.
MFC after: 2 weeks
|
220762 |
18-Apr-2011 |
rmacklem |
Change the mutex locking for several locations in the experimental NFS client's vnode op functions to make them compatible with the regular NFS client. I'll admit I'm not sure that the mutex locks around the assignments are needed, but the regular client has them, so I added them. Also, add handling of the case of partial attributes in setattr to be compatible with the regular client.
MFC after: 2 weeks
|
220761 |
17-Apr-2011 |
rmacklem |
Add checks for MNTK_UNMOUNTF at the beginning of three functions, so that threads don't get stuck in them during a forced dismount. nfs_sync/VFS_SYNC() needs this, since it is called by dounmount() before VFS_UNMOUNT(). The nfscl_nget() case makes sure that a thread doing a VOP_OPEN() or VOP_ADVLOCK() call doesn't get blocked before attempting the RPC. Attempted RPCs don't block, since they all fail once a forced dismount is in progress. The third one at the beginning of nfsrpc_close() is done so threads don't get blocked while doing VOP_INACTIVE() as the vnodes are cleared out. These three changes, plus a change to the umount(8) command so that it doesn't do "sync()" for the forced case, seem to make forced dismounts work for the experimental NFS client.
MFC after: 2 weeks
|
220752 |
17-Apr-2011 |
rmacklem |
Get rid of the "nfscl: consider increasing kern.ipc.maxsockbuf" message that was generated when doing experimental NFS client mounts. I put that message in because the krpc would hang with the default size for mounts that used large rsize/wsize values. Since the bug that caused these hangs was fixed by r213756, I think the message is no longer needed.
MFC after: 2 weeks
|
220751 |
17-Apr-2011 |
rmacklem |
Fix up some of the sysctls for the experimental NFS client so that they use the same names as the regular client. Also add string descriptions for them.
MFC after: 2 weeks
|
220739 |
17-Apr-2011 |
rmacklem |
Change some defaults in the experimental NFS client to be the same as the regular NFS client for NFSv3. The main one is making use of a reserved port# the default. Also, set the retry limit for TCP the same and fix the code so that it doesn't disable readdirplus for NFSv4.
MFC after: 2 weeks
|
220735 |
17-Apr-2011 |
rmacklem |
Fix readdirplus in the experimental NFS client so that it skips over ".." to avoid a LOR race with nfs_lookup(). This fix is analogous to r138256 in the regular NFS client.
MFC after: 2 weeks
|
220732 |
16-Apr-2011 |
rmacklem |
Add a lktype flags argument to nfscl_nget() and ncl_nget() in the experimental NFS client so that its nfs_lookup() function can use cn_lkflags in a manner analogous to the regular NFS client.
MFC after: 2 weeks
|
220731 |
16-Apr-2011 |
rmacklem |
Add mutex locking on the nfs node in ncl_inactive() for the experimental NFS client.
MFC after: 2 weeks
|
220683 |
15-Apr-2011 |
rmacklem |
Change the experimental NFS client so that it creates nfsiod threads in the same manner as the regular NFS client after r214026 was committed. This resolves the LORs fixed by r214026 and its predecessors for the regular client.
Reviewed by: jhb MFC after: 2 weeks
|
220648 |
14-Apr-2011 |
rmacklem |
Fix the experimental NFSv4 server so that it uses VOP_PATHCONF() to determine if a file system supports NFSv4 ACLs. Since VOP_PATHCONF() must be called with a locked vnode, the function is called before nfsvno_fillattr() and the result is passed in as an extra argument.
MFC after: 2 weeks
|
220645 |
14-Apr-2011 |
rmacklem |
Modify the experimental NFSv4 server so that it handles crossing of server mount points properly. The functions nfsvno_fillattr() and nfsv4_fillattr() were modified to take the extra arguments that are the mount point, a flag to indicate that it is a file system root and the mounted on fileno. The mount point argument needs to be busy when nfsvno_fillattr() is called, since the vp argument is not locked.
Reviewed by: kib MFC after: 2 weeks
|
220611 |
13-Apr-2011 |
rmacklem |
Add VOP_PATHCONF() support to the experimental NFS client so that it can, along with other things, report whether or not NFS4 ACLs are supported.
MFC after: 2 weeks
|
220610 |
13-Apr-2011 |
rmacklem |
Fix the experimental NFSv4 client so that it recognizes server mount point crossings correctly. It was testing the wrong flag. Also, try harder to make sure that the fsid is different than the one assigned to the client mount point, by hashing the server's fsid (just to create a different value deterministically) when it is the same.
MFC after: 2 weeks
|
220546 |
11-Apr-2011 |
rmacklem |
Vrele ni_startdir in the experimental NFS server for the case of NFSv2 getting an error return from VOP_MKNOD(). Without this patch, the server file system remains busy after an NFSv2 VOP_MKNOD() fails.
MFC after: 2 weeks
|
220530 |
10-Apr-2011 |
rmacklem |
Add some cleanup code to the module unload operation for the experimental NFS server, so that it doesn't leak memory when unloaded. However, unloading the NFSv4 server is not recommended, since all NFSv4 state will be lost by the unload and clients will have to recover the state after a server reload/restart as if the server crashed/rebooted.
MFC after: 2 weeks
|
220507 |
10-Apr-2011 |
rmacklem |
Add a VOP_UNLOCK() for the directory, when that is not what VOP_LOOKUP() returned. This fixes a bug in the experimental NFS server for the case where VFS_VGET() fails returning EOPNOTSUPP in the ReaddirPlus RPC, forcing the use of VOP_LOOKUP() instead.
MFC after: 2 weeks
|
220506 |
09-Apr-2011 |
kib |
Linuxolator calls VOP_READDIR with ncookies pointer. Implement a workaround for fdescfs to not panic when ncookies is not NULL, similar to the one committed as r152254, but simpler, due to fdescfs_readdir() not calling vfs_read_dirent().
PR: kern/156177 MFC after: 1 week
|
220400 |
06-Apr-2011 |
trasz |
Add RACCT_NOFILE accounting.
Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
|
220152 |
30-Mar-2011 |
zack |
This patch fixes the experimental NFS client to properly deal with 32-bit or 64-bit fileids in NFSv2 and NFSv3. Without this fix, invalid casting (and sign extension) was creating problems for any fileid greater than 2^31.
We discovered this because we have test clusters with more than 2 billion allocated files and 64-bit ino_t's (and friend structures).
Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
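The sign-extension problem can be illustrated with a small sketch. Both functions are hypothetical and are not the client's code; the signed path also assumes the usual two's-complement wrap when converting an out-of-range value to int32_t, which is implementation-defined in C.

```c
#include <stdint.h>

/* Problematic widening: passing through a signed 32-bit type sign-extends. */
static uint64_t
widen_signed_sketch(uint32_t wire)
{
	return ((uint64_t)(int64_t)(int32_t)wire);
}

/* Safe widening: stay unsigned all the way, so the high bits are zero. */
static uint64_t
widen_unsigned_sketch(uint32_t wire)
{
	return ((uint64_t)wire);
}
```

For values below 2^31 the two agree; above that, the signed path fills the upper 32 bits with ones.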
|
220014 |
25-Mar-2011 |
kib |
Report EBUSY instead of EROFS for an attempt to delete or rename the root directory of an msdosfs mount. The VFS code would handle the deletion case itself too, assuming the VV_ROOT flag is not lost. msdosfs_rename() should also catch an attempt to rename the root via doscheckpath() or the different-mount-point check leading to EXDEV. Nonetheless, keep the checks for now.
The change is inspired by the NetBSD change referenced in the PR, but returns EBUSY like kern_unlinkat() does.
PR: kern/152079 MFC after: 1 week
|
219968 |
24-Mar-2011 |
jhb |
Fix some locking nits with the p_state field of struct proc:
- Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL in fork to honor the locking requirements. While here, expand the scope of the PROC_LOCK() on the new process (p2) to avoid some LORs. Previously the code was locking the new child process (p2) after it had locked the parent process (p1). However, when locking two processes, the safe order is to lock the child first, then the parent.
- Fix various places that were checking p_state against PRS_NEW without having the process locked to use PROC_LOCK(). Every place was already locking the process, just after the PRS_NEW check.
- Remove or reduce the use of PROC_SLOCK() for places that were checking p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for reading the current state.
- Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once.
MFC after: 1 week
|
219028 |
25-Feb-2011 |
netchild |
Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/ PMC/SYSV/...).
No FreeBSD version bump; the userland application to query the features will be committed last and can serve as an indication of the availability if needed.
Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: arch@ (parts by rwatson, trasz, jhb) X-MFC after: to be determined in last commit with code from this project
|
219012 |
24-Feb-2011 |
jhb |
Use ffs() to locate free bits in the inode and block bitmaps rather than loops with bit shifts.
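A minimal sketch of the idea, assuming a bitmap in which a set bit means "allocated" and using the POSIX ffs() from <strings.h>. The function name and the byte-at-a-time scan are illustrative and are not the ext2fs code.

```c
#include <strings.h>	/* ffs() */

/*
 * Hypothetical helper: return the index of the first clear (free) bit in
 * a bitmap of nbytes bytes, or -1 if every bit is set.
 */
static int
first_free_bit_sketch(const unsigned char *map, int nbytes)
{
	int i, bit;

	for (i = 0; i < nbytes; i++) {
		/* Invert so free bits become 1; ffs() is 1-based and returns 0 for none. */
		bit = ffs(~map[i] & 0xff);
		if (bit != 0)
			return (i * 8 + bit - 1);
	}
	return (-1);
}
```

One ffs() call per byte replaces an inner loop that shifts and tests each of the eight bits in turn.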
|
218965 |
23-Feb-2011 |
brucec |
Fix typos - remove duplicate "is".
PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
|
218949 |
22-Feb-2011 |
alc |
Eliminate two dubious attempts at optimizing the implementation of a file's last accessed, modified, and changed times:
TMPFS_NODE_ACCESSED and TMPFS_NODE_CHANGED should be set unconditionally in tmpfs_remove() without regard to the number of hard links to the file. Otherwise, after the last directory entry for a file has been removed, a process that still has the file open could read stale values for the last accessed and changed times with fstat(2).
Similarly, tmpfs_close() should update the time-related fields even if all directory entries for a file have been removed. In this case, the effect is that the time-related fields will have values that are later than expected. They will correspond to the time at which fstat(2) is called.
In collaboration with: kib MFC after: 1 week
|
218909 |
21-Feb-2011 |
brucec |
Fix typos - remove duplicate "the".
PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
|
218863 |
19-Feb-2011 |
alc |
tmpfs_remove() isn't modifying the file's data, so it shouldn't set TMPFS_NODE_MODIFIED on the node.
PR: 152488 Submitted by: Anton Yuzhaninov Reviewed by: kib MFC after: 1 week
|
218757 |
16-Feb-2011 |
bz |
Mfp4 CH=177274,177280,177284-177285,177297,177324-177325
VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched. Add assertions to the socket code to catch possibly unset vnets as seen in r204147.
While this reduces the number of vnet recursions, in some places (NFS, POSIX local sockets and some netgraph code, among others) recursions are impossible to fix.
The current expectations are documented at the beginning of uipc_socket.c along with the other information there.
Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb Tested by: zec
Tested by: Mikolaj Golub (to.my.trociny gmail.com) MFC after: 2 weeks
|
218681 |
14-Feb-2011 |
alc |
Further simplify tmpfs_reg_resize(). Also, update its comments, including style fixes.
|
218640 |
13-Feb-2011 |
alc |
Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm object's size field. Previously, that field was always zero, even when the object tn_reg.tn_aobj contained numerous pages.
Apply style fixes to tmpfs_reg_resize().
In collaboration with: kib
|
218438 |
08-Feb-2011 |
jhb |
After reading a bitmap block for i-nodes or blocks, recheck the count of free i-nodes or blocks to handle a race where another thread might have allocated the last i-node or block while we were waiting for the buffer.
Tested by: dougb
|
218345 |
05-Feb-2011 |
alc |
Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync() are incorrectly calling vm_object_page_clean(). They are passing the length of the range rather than the ending offset of the range.
Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the callers.
Reviewed by: kib MFC after: 3 weeks
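The length-versus-ending-offset mix-up can be shown with a toy range helper. Everything here is hypothetical: the helper is not vm_object_page_clean(), and the 4096-byte page size is an assumption made for the example.

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH	12
#define PAGE_SIZE_SKETCH	((uint64_t)1 << PAGE_SHIFT_SKETCH)

/*
 * Hypothetical helper taking byte offsets [start, end): return how many
 * pages the range touches. The byte-to-page conversion happens here, in
 * the callee, so callers only deal in byte offsets.
 */
static uint64_t
pages_in_range_sketch(uint64_t start, uint64_t end)
{
	if (end <= start)
		return (0);
	return (((end + PAGE_SIZE_SKETCH - 1) >> PAGE_SHIFT_SKETCH) -
	    (start >> PAGE_SHIFT_SKETCH));
}
```

With an offset of 8192 and a count of 4096, passing offset + count as the end covers one page, while passing the count alone as the end yields an empty range.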
|
218273 |
04-Feb-2011 |
jhb |
Collapse duplicate definitions of EXT2_SB().
Submitted by: Pedro F. Giffuni giffunip at yahoo
|
218190 |
02-Feb-2011 |
jhb |
Fix build with DIAGNOSTIC enabled.
Pointy hat to: jhb
|
218176 |
01-Feb-2011 |
jhb |
Some cosmetic fixes and remove a duplicate constant.
Submitted by: Pedro F. Giffuni giffunip at yahoo
|
218175 |
01-Feb-2011 |
jhb |
- Set the next_alloc fields for an i-node after allocating a new block so that future allocations start with most recently allocated block rather than the beginning of the filesystem.
- Fix ext2_alloccg() to properly scan for 8 block chunks that are not aligned on 8-bit boundaries. Previously this was causing new blocks to be allocated in a highly fragmented fashion (block 0 of a file at lbn N, block 1 at lbn N + 8, block 2 at lbn N + 16, etc.).
- Cosmetic tweaks to the currently-disabled fancy realloc sysctls.
PR: kern/153584 Discussed with: bde Tested by: Pedro F. Giffuni giffunip at yahoo, Zheng Liu (lz)
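Scanning for an 8-block chunk that does not start on a byte boundary can be sketched as a bit-granular run search. This is an illustration under assumed conventions (set bit means allocated, least significant bit first), not the ext2_alloccg() implementation; the function name is hypothetical.

```c
/*
 * Hypothetical helper: return the index of the first bit that starts a
 * run of eight consecutive clear (free) bits, or -1 if there is no such
 * run. The run may straddle a byte boundary.
 */
static int
find_free_run8_sketch(const unsigned char *map, int nbits)
{
	int i, run = 0;

	for (i = 0; i < nbits; i++) {
		if (map[i / 8] & (1 << (i % 8)))
			run = 0;		/* allocated bit breaks the run */
		else if (++run == 8)
			return (i - 7);		/* start of the 8-bit run */
	}
	return (-1);
}
```

For the bytes 0x0f and 0xf0, a search that only accepts whole zero bytes finds nothing, while this scan finds the free run covering bits 4 through 11.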
|
217922 |
27-Jan-2011 |
gnn |
Quick fix to a comment.
|
217896 |
26-Jan-2011 |
dchagin |
Add macro to test the sv_flags of any process. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures.
MFC after: 1 month
|
217703 |
21-Jan-2011 |
jhb |
- Move special inode constants to ext2_dinode.h and rename them to match NetBSD.
- Add a constant for the HASJOURNAL compat flag.
PR: kern/153584 Submitted by: Pedro F. Giffuni giffunip at yahoo
|
217702 |
21-Jan-2011 |
jhb |
Restore support for the 'async' and 'sync' mount options lost when switching to nmount(2). While here, sort the options.
PR: kern/153584 Submitted by: Pedro F. Giffuni giffunip at yahoo MFC after: 1 week
|
217633 |
20-Jan-2011 |
kib |
In tmpfs_readdir(), normalize handling of the directory entries that either overflow the supplied buffer, or cause uiomove() to fail. Do not advance the cached de when a directory entry was not copied out. Do not return EOF when no entries could be copied because the first entry is too large for the supplied buffer; signal EINVAL instead.
Reported by: Beat Gätzi <beat chruetertee ch> MFC after: 1 week
|
217594 |
19-Jan-2011 |
jhb |
Fix build with KDB defined.
Pointy hat to: jhb Submitted by: jkim
|
217585 |
19-Jan-2011 |
jhb |
Whitespace and style fixes.
|
217584 |
19-Jan-2011 |
jhb |
Move calculation of 'bmask' earlier to match its current location in ufs_lookup().
|
217582 |
19-Jan-2011 |
jhb |
Merge 118969 from UFS: Eliminate the i_devvp field from the incore inodes, we can get the same value from ip->i_ump->um_devvp.
Submitted by: Pedro F. Giffuni giffunip at yahoo MFC after: 1 week
|
217535 |
18-Jan-2011 |
rmacklem |
Fix the experimental NFSv4 server so that it uses VOP_ACCESSX() to check for VREAD_ACL instead of VOP_ACCESS().
MFC after: 3 days
|
217432 |
14-Jan-2011 |
rmacklem |
Modify the experimental NFSv4 server so that it posts a SIGUSR2 signal to the master nfsd daemon whenever the stable restart file has been modified. This will allow the master nfsd daemon to maintain an up to date backup copy of the file. This is enabled via the nfssvc() syscall, so that older nfsd daemons will not be signaled.
Reviewed by: jhb MFC after: 1 week
|
217336 |
12-Jan-2011 |
zack |
In the experimental NFS server, when converting an open-owner to a lock-owner, start at sequence id 1 instead of 0, to match up with both Solaris and Linux.
Reviewed by: rmacklem Approved by: zml (mentor)
|
217335 |
12-Jan-2011 |
zack |
Clean up the experimental NFS server replay cache when the module is unloaded.
Reviewed by: rmacklem Approved by: zml (mentor)
|
217176 |
09-Jan-2011 |
rmacklem |
Modify readdirplus in the experimental NFS server in a manner analogous to r216633 for the regular server. This change busies the file system so that VFS_VGET() is guaranteed to be using the correct mount point even during a forced dismount attempt. Since nfsd_fhtovp() is not called immediately before readdirplus, the patch is actually a clone of pjd@'s nfs_serv.c.4.patch instead of the one committed in r216633.
Reviewed by: kib MFC after: 10 days
|
217066 |
06-Jan-2011 |
rmacklem |
Delete the NFS_STARTWRITE() and NFS_ENDWRITE() macros that obscured vn_start_write() and vn_finished_write() for the old OpenBSD port, since most uses have been replaced by the correct calls.
MFC after: 12 days
|
217063 |
06-Jan-2011 |
rmacklem |
Since the VFS_LOCK_GIANT() code in the experimental NFS server is broken and the major file systems are now all mpsafe, modify the server so that it will only export mpsafe file systems. This was discussed on freebsd-fs@ and removes a fair bit of crufty code.
MFC after: 12 days
|
217023 |
05-Jan-2011 |
rmacklem |
Modify the experimental NFS server so that it calls vn_start_write() with a non-NULL vp. That way it will find the correct mount point mp and use that mp for the subsequent vn_finished_write() call. Also, it should fail without crashing if the mount point is being forced dismounted because vn_start_write() will set the mp NULL via VOP_GETWRITEMOUNT().
Reviewed by: kib MFC after: 12 days
|
217017 |
05-Jan-2011 |
rmacklem |
Fix the experimental NFS server to use vfs_busyfs() instead of vfs_getvfs() so that the mount point is busied for the VFS_FHTOVP() call. This is analogous to r185432 for the regular NFS server.
Reviewed by: kib MFC after: 12 days
|
216931 |
03-Jan-2011 |
rmacklem |
Fix the nlm so that it no longer depends on the regular nfs client and, as such, can be loaded for the experimental nfs client without the regular client.
Reviewed by: jhb MFC after: 2 weeks
|
216898 |
03-Jan-2011 |
rmacklem |
Fix the experimental NFS server so that it doesn't leak a reference count on the directory when creating device special files.
MFC after: 2 weeks
|
216897 |
03-Jan-2011 |
rmacklem |
Modify the experimental NFSv4 server so that the lookup ops return a locked vnode. This ensures that the associated mount point will always be valid for the code that follows the operation. Also add a couple of additional checks for non-error to the other functions that create file objects.
MFC after: 2 weeks
|
216894 |
02-Jan-2011 |
rmacklem |
Delete some cruft from the experimental NFS server that was only used by the OpenBSD port for its pseudo-fs.
MFC after: 2 weeks
|
216893 |
02-Jan-2011 |
rmacklem |
Add checks for VI_DOOMED and vn_lock() failures to the experimental NFS server, to handle the case where an exported file system is forced dismounted while an RPC is in progress. Further commits will fix the cases where a mount point is used when the associated vnode isn't locked.
Reviewed by: kib MFC after: 2 weeks
|
216875 |
01-Jan-2011 |
rmacklem |
Add support for shared vnode locks for the Read operation in the experimental NFSv4 server.
Reviewed by: kib MFC after: 2 weeks
|
216784 |
28-Dec-2010 |
rmacklem |
Delete the nfsvno_localconflict() function in the experimental NFS server since it is no longer used and is broken.
MFC after: 2 weeks
|
216700 |
25-Dec-2010 |
rmacklem |
Modify the experimental NFS server so that it uses LK_SHARED for RPC operations when it can. Since VFS_FHTOVP() currently always gets an exclusively locked vnode and is usually called at the beginning of each RPC, the RPCs for a given vnode will still be serialized. As such, passing a lock type argument to VFS_FHTOVP() would be preferable to doing the vn_lock() with LK_DOWNGRADE after the VFS_FHTOVP() call.
Reviewed by: kib MFC after: 2 weeks
|
216693 |
24-Dec-2010 |
rmacklem |
Add an argument to nfsvno_getattr() in the experimental NFS server, so that it can avoid calling VOP_ISLOCKED() when the vnode is known to be locked. This will allow LK_SHARED to be used for these cases, which happen to be all the cases that can use LK_SHARED. This does not fix any bug, but it reduces the number of calls to VOP_ISLOCKED() and prepares the code so that it can be switched to using LK_SHARED in a future patch.
Reviewed by: kib MFC after: 2 weeks
|
216692 |
24-Dec-2010 |
rmacklem |
Simplify vnode locking in the experimental NFS server's readdir functions. In particular, get rid of two bogus VOP_ISLOCKED() calls. Removing the VOP_ISLOCKED() calls is the only actual bug fixed by this patch.
Reviewed by: kib MFC after: 2 weeks
|
216691 |
24-Dec-2010 |
rmacklem |
Since VOP_READDIR() for ZFS does not return monotonically increasing directory offset cookies, disable the UFS related loop that skips over directory entries at the beginning of the block for the experimental NFS server. This loop is required for UFS since it always returns directory entries starting at the beginning of the block that the requested directory offset is in. In discussion with pjd@ and mckusick@ it seems that this behaviour of UFS should maybe change, with this fix being an interim patch until then. This patch only fixes the experimental server, since pjd@ is working on a patch for the regular server.
Discussed with: pjd, mckusick MFC after: 5 days
|
216510 |
17-Dec-2010 |
rmacklem |
Fix two vnode locking problems in nfsd_recalldelegation() in the experimental NFSv4 server. The first was a bogus use of VOP_ISLOCKED() in a KASSERT() and the second was the need to lock the vnode for the nfsrv_checkremove() call. Also, delete a "__unused" that was bogus, since the argument is used.
Reviewed by: zack.kirsch at isilon.com MFC after: 2 weeks
|
216462 |
15-Dec-2010 |
jh |
Don't allow user created symbolic links to cover other entries marked with DE_USER. If a devfs rule hid such an entry, it was possible to create an infinite number of symbolic links with the same name.
Reviewed by: kib
|
216461 |
15-Dec-2010 |
jh |
- Assert that dm_lock is exclusively held in devfs_rules_apply() and in devfs_vmkdir() while adding the entry to de_list of the parent. - Apply devfs rules to newly created directories and symbolic links.
PR: kern/125034 Submitted by: Mateusz Guzik (original version)
|
216391 |
12-Dec-2010 |
jh |
Handle the special ruleset 0 in devfs_ruleset_use(). An attempt to set the current ruleset to 0 with the command "devfs ruleset 0" triggered a KASSERT in devfs_ruleset_create().
PR: kern/125030 Submitted by: Mateusz Guzik
|
216330 |
09-Dec-2010 |
rmacklem |
Disable attempts to establish a callback connection from the experimental NFSv4 server to a NFSv4 client when delegations are not being issued, even if the client advertises a callback path. This avoids a problem where a Linux client advertises a callback path that doesn't work, due to a firewall, and then times out an Open attempt before the FreeBSD server gives up its callback connection attempt. (Suggested by drb at karlov.mff.cuni.cz to fix the Linux client problem that he reported on the fs-stable mailing list.) The server should probably have a 1sec timeout on callback connection attempts when there are no delegations issued to the client, but that patch will require changes to the krpc and this serves as a work around until then.
Tested by: drb at karlov.mff.cuni.cz MFC after: 5 days
|
216128 |
02-Dec-2010 |
trasz |
Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage.
Reviewed by: kib@ Tested by: pho@ Sponsored by: FreeBSD Foundation
|
216120 |
02-Dec-2010 |
kib |
For non-stopped threads, td_frame pointer is undefined. As a consequence, fill_regs() and fill_fpregs() access random data, usually on the thread kernel stack. Most often the td_frame points to the previous frame saved by last kernel entry sequence, but this is not guaranteed.
For /proc/<pid>/{regs,fpregs} read access, require the thread to be in stopped state. Otherwise, return EBUSY as is done for write case.
Reported and tested by: pho Approved by: des (procfs maintainer) MFC after: 1 week
|
215548 |
19-Nov-2010 |
kib |
Remove the prtactive variable and related printf()s in the vop_inactive() and vop_reclaim() methods. They seem to be unused, and the reported situation is normal for the forced unmount.
MFC after: 1 week X-MFC-note: keep prtactive symbol in vfs_subr.c
|
215052 |
09-Nov-2010 |
jhb |
Remove unused includes of <sys/mutex.h> and <machine/mutex.h>.
|
214513 |
29-Oct-2010 |
rmacklem |
Modify nfs_open() in the experimental NFS client to be compatible with the regular NFS client. Also, fix a couple of mutex lock issues.
MFC after: 1 week
|
214511 |
29-Oct-2010 |
rmacklem |
Add a call for nfsrpc_close() to ncl_reclaim() in the experimental NFSv4 client, since the call in ncl_inactive() might be missed because VOP_INACTIVE() is not guaranteed to be called before VOP_RECLAIM().
MFC after: 1 week
|
214406 |
26-Oct-2010 |
rmacklem |
Add a flag to the experimental NFSv4 client to indicate when delegations are being returned for reasons other than a Recall. Also, re-organize nfscl_recalldeleg() slightly, so that it leaves clearing NMODIFIED to the ncl_flush() call and invalidates the attribute cache after flushing. It is hoped that these changes might fix the problem others have seen when using the NFSv4 client with delegations enabled, since I can't reliably reproduce the problem. These changes only affect the client when doing NFSv4 mounts with delegations enabled.
MFC after: 10 days
|
214255 |
23-Oct-2010 |
rmacklem |
Modify the experimental NFSv4 server's file handle hash function to use the generic hash32_buf() function. Although adding the bytes seemed sufficient for UFS and ZFS, since most of the bytes are the same for file handles on the same volume, this might not be sufficient for other file systems. Use of a generic function also seems preferable to one specific to NFSv4.
Suggested by: gleb.kurtsou at gmail.com MFC after: 10 days
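A userspace sketch of why summing bytes may be too weak as a file handle hash. The names sum_hash() and step_hash() are hypothetical; step_hash() is modeled on the general style of hash32_buf() (seed 5381, h = h * 33 + c), and those constants should be read as assumptions rather than as the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Additive hash: sums the bytes, so byte order does not matter. */
static uint32_t
sum_hash(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 0;

	assert(buf != NULL);
	while (len-- > 0)
		h += *p++;
	return (h);
}

/* Step hash: h = h * 33 + c, so byte position affects the result. */
static uint32_t
step_hash(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 5381;	/* assumed seed */

	assert(buf != NULL);
	while (len-- > 0)
		h = ((h << 5) + h) + *p++;
	return (h);
}
```

Two buffers holding the same bytes in a different order collide under sum_hash() but not under step_hash(), which is the kind of weakness a generic hash function avoids.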
|
214224 |
22-Oct-2010 |
rmacklem |
Modify the file handle hash function in the experimental NFS server so that it will work better for non-UFS file systems. The new function simply sums the bytes of the fh_fid field of fhandle_t.
MFC after: 10 days
|
214149 |
21-Oct-2010 |
rmacklem |
Modify the experimental NFS server in a manner analogous to r214049 for the regular NFS server, so that it will not do a VOP_LOOKUP() of ".." when at the root of a file system when performing a ReaddirPlus RPC.
MFC after: 10 days
|
214053 |
19-Oct-2010 |
rmacklem |
Fix the type of the 3rd argument for nm_getinfo so that it works for architectures like sparc64.
Suggested by: kib MFC after: 2 weeks
|
214048 |
19-Oct-2010 |
rmacklem |
Modify the NFS clients and the NLM so that the NLM can be used by both clients. Since the NLM uses various fields of the nfsmount structure, those fields were extracted and put in a separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h. This structure also has a function pointer for a function that extracts the required information from the mount point and nfs vnode for that particular client, for information stored differently by the clients.
Reviewed by: jhb MFC after: 2 weeks
|
214001 |
18-Oct-2010 |
kevlo |
Fix a possible race where the directory dirent is moved to the location that was used by the ".." entry. This change seems to fix a panic during attempts to access msdosfs data over nfs.
Reviewed by: kib MFC after: 1 week
|
213771 |
13-Oct-2010 |
rpaulo |
Ignore the return value of DE_INTERNALIZE().
|
213735 |
12-Oct-2010 |
avg |
tmpfs + sendfile: do not produce partially valid pages for vnode's tail
See r213730 for details of analogous change in ZFS.
MFC after: 3 days
|
213725 |
12-Oct-2010 |
jh |
Format prototypes to follow style(9) more closely.
Discussed with: kib, phk
|
213712 |
11-Oct-2010 |
rmacklem |
Try and make the nfsrv_localunlock() function in the experimental NFSv4 server more readable. Mostly changes to comments, but a case of >= is changed to >, since == can never happen. Also, I've added a couple of KASSERT()s and a slight optimization, since once the "else if" case happens, subsequent locks in the list can't have any effect. None of these changes fixes any known bug.
MFC after: 2 weeks
|
213664 |
10-Oct-2010 |
kib |
The r184588 changed the layout of struct export_args, causing an ABI breakage for old mount(2) syscall, since most struct <filesystem>_args embed export_args. The mount(2) is supposed to provide ABI compatibility for pre-nmount mount(8) binaries, so restore ABI to pre-r184588.
Requested and reviewed by: bde MFC after: 2 weeks
|
213543 |
08-Oct-2010 |
kib |
Add a comment describing the reason for calling cache_purge(fvp).
Requested by: danfe MFC after: 6 days
|
213508 |
07-Oct-2010 |
kib |
The msdosfs lookup is case insensitive. Several aliases may be inserted for a single directory entry. As a consequence, the name cache purge done by lookup for fvp when the DELETE op for namei is specified might not be enough to expunge all namecache entries that were installed for this direntry.
Explicitly call cache_purge(fvp) when msdosfs_rename() succeeded.
PR: kern/93634 MFC after: 1 week
|
213363 |
02-Oct-2010 |
alc |
M_USE_RESERVE has been deprecated for a decade. Eliminate any uses that have no run-time effect.
|
213221 |
27-Sep-2010 |
jh |
Add a new function devfs_dev_exists() to be able to find out if a specific devfs path already exists.
The function will be used from kern_conf.c to detect duplicate device registrations. Callers must hold the devmtx mutex.
Reviewed by: kib
|
213215 |
27-Sep-2010 |
jh |
Add reference counting for devfs paths containing user created symbolic links. The reference counting is needed to be able to determine if a specific devfs path exists. For true device file paths we can traverse the cdevp_list but a separate directory list is needed for user created symbolic links.
Add a new directory entry flag DE_USER to mark entries which should unreference their parent directory on deletion.
A new function to traverse cdevp_list and the directory list will be introduced in a separate commit.
Idea from: kib Reviewed by: kib
|
212966 |
21-Sep-2010 |
jh |
Modify devfs_fqpn() for future use in devfs path reference counting code:
- Accept devfs_mount and devfs_dirent as the arguments instead of a vnode. This generalizes the function so that it can be used from contexts where vnode references are not available. - Accept NULL cnp argument. No '/' will be appended, if a NULL cnp is provided. - Make the function global and add its prototype to devfs.h.
Reviewed by: kib
|
212834 |
19-Sep-2010 |
rmacklem |
Fix nfsrv_freeallnfslocks() in the experimental NFSv4 server so that it frees local locks correctly upon close. In order for nfsrv_localunlock() to work correctly, the lock can no longer be in the lockowner's stateid list. As such, nfsrv_freenfslock() has to be called before nfsrv_localunlock(), to get rid of the lock structure on the lockowner's stateid list. This only affected operation when local locks (vfs.newnfs.enable_locallocks=1) are enabled, which is not the default at this time.
MFC after: 1 week
|
212833 |
19-Sep-2010 |
rmacklem |
Fix the experimental NFSv4 server so that it performs local VOP_ADVLOCK() unlock operations correctly. It was passing in F_SETLK instead of F_UNLCK as the operation for the unlock case. This only affected operation when local locking (vfs.newnfs.enable_locallocks=1) was enabled.
MFC after: 1 week
|
212826 |
18-Sep-2010 |
jh |
- For consistency, remove "." and ".." entries from de_dlist before calling devfs_delete() (and thus possibly dropping dm_lock) in devfs_rmdir_empty(). - Assert that we don't return doomed entries from devfs_find(). [1]
Suggested by: kib [1] Reviewed by: kib
|
212660 |
15-Sep-2010 |
jh |
Remove empty devfs directories automatically.
devfs_delete() now recursively removes empty parent directories unless the DEVFS_DEL_NORECURSE flag is specified. devfs_delete() can't be called anymore with a parent directory vnode lock held because the possible parent directory deletion needs to lock the vnode. Thus we unlock the parent directory vnode in devfs_remove() before calling devfs_delete().
Call devfs_populate_vp() from devfs_symlink() and devfs_vptocnp() as now directories can get removed.
Add a check for DE_DOOMED flag to devfs_populate_vp() because devfs_delete() drops dm_lock before the VI_DOOMED vnode flag gets set. This ensures that devfs_populate_vp() returns an error for directories which are in progress of deletion.
Reviewed by: kib Discussed on: freebsd-current (mostly silence)
|
212650 |
15-Sep-2010 |
avg |
tmpfs, zfs + sendfile: mark page bits as valid after populating it with data
Otherwise, adding insult to injury, in addition to double-caching of data we would always copy the data into a vnode's vm object page from backend. This is specific to sendfile case only (VOP_READ with UIO_NOCOPY).
PR: kern/141305 Reported by: Wiktor Niesiobedzki <bsd@vink.pl> Reviewed by: alc Tested by: tools/regression/sockets/sendfile MFC after: 2 weeks
|
212443 |
10-Sep-2010 |
rmacklem |
This patch applies one of the two fixes suggested by zack.kirsch at isilon.com for a race between nfsrv_freeopen() and nfsrv_getlockfile() in the experimental NFS server that he found during testing. Although nfsrv_freeopen() holds a sleep lock on the lock file structure when called with cansleep != 0, nfsrv_getlockfile() could still search the list, once it acquired the NFSLOCKSTATE() mutex. I believe that acquiring the mutex in nfsrv_freeopen() fixes the race.
MFC after: 2 weeks
|
212439 |
10-Sep-2010 |
rmacklem |
Fix the NFSVNO_CMPFH() macro in the experimental NFS server so that it works correctly for ZFS file handles. It is possible to have two ZFS file handles that differ only in the bytes in the fid_reserved field of the generic "struct fid" and comparing the bytes in fid_data didn't catch this case. This patch changes the macro to compare all bytes of "struct fid".
Tested by: gull at gull.us MFC after: 2 weeks
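A small userspace sketch of the comparison problem described in this entry. struct demo_fid is a hypothetical stand-in whose layout is illustrative only, and both helper functions are invented for the example; this is not the NFSVNO_CMPFH() macro itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAXFIDSZ 16	/* hypothetical size */

/* Hypothetical stand-in for a generic file identifier. */
struct demo_fid {
	uint16_t fid_len;
	uint16_t fid_reserved;
	char fid_data[DEMO_MAXFIDSZ];
};

/* Old-style check: compares fid_data only. Returns 1 if "equal". */
static int
demo_cmp_data_only(const struct demo_fid *a, const struct demo_fid *b)
{
	assert(a != NULL && b != NULL);
	return (memcmp(a->fid_data, b->fid_data, sizeof(a->fid_data)) == 0);
}

/* New-style check: compares every byte of the structure. */
static int
demo_cmp_all(const struct demo_fid *a, const struct demo_fid *b)
{
	assert(a != NULL && b != NULL);
	return (memcmp(a, b, sizeof(*a)) == 0);
}
```

Two identifiers that differ only in fid_reserved compare equal under the data-only check and unequal under the whole-structure check. The sketch assumes both structures were zeroed before being filled in.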
|
212362 |
09-Sep-2010 |
rmacklem |
Fix the experimental NFS client so that it doesn't panic when NFSv2,3 byte range locking is attempted. A fix that allows the nlm_advlock() to work with both clients is in progress, but may take a while. As such, I am doing this commit so that the kernel doesn't panic in the meantime.
Submitted by: jh MFC after: 2 weeks
|
212305 |
07-Sep-2010 |
ivoras |
Avoid "Entry can disappear before we lock fdvp" panic.
PR: 150143 Submitted by: Gleb Kurtsou <gk at FreeBSD.org> Pretty sure it won't blow up: mckusick MFC after: 2 weeks
|
212293 |
07-Sep-2010 |
jhb |
Store the full timestamp when caching timestamps of files and directories for purposes of validating name cache entries. This closes races where two updates to a file or directory within the same second could result in stale entries in the name cache. While here, remove the 'n_expiry' field as it is no longer used.
Reviewed by: rmacklem MFC after: 1 week
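The race closed by this entry can be illustrated with a userspace sketch. The function names are hypothetical and the code only models the comparison, not the NFS client's name cache.

```c
#include <assert.h>
#include <time.h>

/* Seconds-only check: misses a second update within the same second. */
static int
changed_sec_only(const struct timespec *cached, const struct timespec *cur)
{
	assert(cached != NULL && cur != NULL);
	return (cached->tv_sec != cur->tv_sec);
}

/* Full-timestamp check: also compares the nanosecond part. */
static int
changed_full(const struct timespec *cached, const struct timespec *cur)
{
	assert(cached != NULL && cur != NULL);
	return (cached->tv_sec != cur->tv_sec ||
	    cached->tv_nsec != cur->tv_nsec);
}
```

Given two modification times in the same second with different nanosecond values, changed_sec_only() reports no change, so a stale entry would survive, while changed_full() detects the update.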
|
212221 |
05-Sep-2010 |
daichi |
Allow unionfs to use a file system that does not support whiteouts as the upper layer. Until now, unionfs refused to use such a file system as the upper layer. With this change, a file system without whiteout support (e.g., tmpfs) can be used as the upper layer. This is very useful for combining tmpfs as the upper layer with a read-only file system as the lower layer.
By definition, without whiteout support from the file system backing the upper layer, there is no way that delete and rename operations on lower layer objects can be done. EOPNOTSUPP is returned for these kinds of operations, as generated by VOP_WHITEOUT(), along with any others which would make modifications to the lower layer, such as chmod(1).
This change is suggested by ed.
Submitted by: ed
|
212217 |
05-Sep-2010 |
rmacklem |
Change the code in ncl_bioread() in the experimental NFS client to return an error when rabp is not set, so it behaves the same way as the regular NFS client for this case. It does not affect NFSv4, since nfs_getcacheblk() only fails for "intr" mounts and NFSv4 can't use the "intr" mount option.
MFC after: 2 weeks
|
212216 |
05-Sep-2010 |
rmacklem |
Disable use of the NLM in the experimental NFS client, since it will crash the kernel because it uses the nfsmount and nfsnode structures of the regular NFS client.
MFC after: 2 weeks
|
212079 |
01-Sep-2010 |
lulf |
- Remove duplicate comment.
PR: kern/148820 Submitted by: pluknet <pluknet - at - gmail.com>
|
212043 |
31-Aug-2010 |
rmacklem |
Add a null_remove() function to nullfs, so that the v_usecount of the lower level vnode is incremented to greater than 1 when the upper level vnode's v_usecount is greater than one. This is necessary for the NFS clients, so that they will do a silly rename of the file instead of actually removing it when the file is still in use. It is "racy", since the v_usecount is incremented in many places in the kernel with minimal synchronization, but an extraneous silly rename is preferred to not doing a silly rename when it is required. The only other file systems that currently check the value of v_usecount in their VOP_REMOVE() functions are nwfs and smbfs. These file systems choose to fail a remove when the v_usecount is greater than 1 and I believe will function more correctly with this patch, as well.
Tested by: to.my.trociny at gmail.com Submitted by: to.my.trociny at gmail.com (earlier version) Reviewed by: kib MFC after: 2 weeks
|
211953 |
28-Aug-2010 |
rmacklem |
Add acquisition of a reference count on nfsv4root_lock to the nfsd_recalldelegation() function, since this function is called by nfsd threads when they are handling NFSv2 or NFSv3 RPCs, where no reference count would have been acquired.
MFC after: 2 weeks
|
211951 |
28-Aug-2010 |
rmacklem |
The timer routine in the experimental NFS server did not acquire the correct mutex when checking nfsv4root_lock. Although this could be fixed by adding mutex lock/unlock calls, zack.kirsch at isilon.com suggested a better fix that uses a non-blocking acquisition of a reference count on nfsv4root_lock. This fix allows the weird NFSLOCKSTATE(); NFSUNLOCKSTATE(); synchronization to be deleted. This patch applies this fix.
Tested by: zack.kirsch at isilon.com MFC after: 2 weeks
|
211847 |
26-Aug-2010 |
jh |
Set de_dir for user created symbolic links. This will be needed to be able to resolve their parent directories.
|
211826 |
25-Aug-2010 |
trasz |
Revert r210194, adding a comment explaining why calls to chgproccnt() in unionfs are actually needed. I have a better fix in trasz_hrl p4 branch, but now is not a good moment to commit it.
Reported by: Alex Kozlov
|
211816 |
25-Aug-2010 |
jh |
Call devfs_populate_vp() from devfs_getattr(). It was possible that fstat(2) returned stale information through an open file descriptor.
|
211628 |
22-Aug-2010 |
jh |
Introduce and use devfs_populate_vp() to unlock a vnode before calling devfs_populate(). This is a prerequisite for the automatic removal of empty directories which will be committed in the future.
Reviewed by: kib (previous version)
|
211598 |
22-Aug-2010 |
ed |
Add support for whiteouts on tmpfs.
Right now unionfs only allows filesystems to be mounted on top of another if it supports whiteouts. Even though I have sent a patch to daichi@ to let unionfs work without it, we'd better also add support for whiteouts to tmpfs.
This patch implements .vop_whiteout and makes necessary changes to lookup() and readdir() to take them into account. We must also make sure that when adding or removing a file, we honour the componentname's DOWHITEOUT and ISWHITEOUT, to prevent duplicate filenames.
MFC after: 1 month
|
211531 |
20-Aug-2010 |
jhb |
Add dedicated routines to toggle lockmgr flags such as LK_NOSHARE and LK_CANRECURSE after a lock is created. Use them to implement macros that otherwise manipulated the flags directly. Assert that the associated lockmgr lock is exclusively locked by the current thread when manipulating these flags to ensure the flag updates are safe. This last change required some minor shuffling in a few filesystems to exclusively lock a brand new vnode slightly earlier.
Reviewed by: kib MFC after: 3 days
|
211513 |
19-Aug-2010 |
jh |
Call dev_rel() in error paths.
Reported by: kib Reviewed by: kib MFC after: 2 weeks
|
211226 |
12-Aug-2010 |
jh |
Allow user created symbolic links to cover device files and directories if the device file appears during or after the link creation.
User created symbolic links are now inserted at the head of the directory entry list after the "." and ".." entries. A new directory entry flag DE_COVERED indicates that an entry is covered by a symbolic link.
PR: kern/114057 Reviewed by: kib Idea from: kib Discussed on: freebsd-current (mostly silence)
|
210997 |
07-Aug-2010 |
rwatson |
Properly bounds check ioctl/pioctl data arguments for Coda:
1. Use unsigned rather than signed lengths 2. Bound messages to/from Venus to VC_MAXMSGSIZE 3. Bound messages to/from general user processes to VC_MAXDATASIZE 4. Update comment regarding data limits for pioctl
Without (1) and (3), it may be possible for unprivileged user processes to read sensitive portions of kernel memory. This issue is only present if the Coda kernel module is loaded and venus (the userspace Coda daemon) is running and has /coda mounted.
As Coda is considered experimental and production use is warned against in the coda(4) man page, and because Coda must be explicitly configured for a configuration to be vulnerable, we won't be issuing a security advisory. However, if you are using Coda, then you are advised to apply these fixes.
Reported by: Dan J. Rosenberg <drosenberg at vsecurity.com> Obtained from: NetBSD (Christos Zoulas) Security: Kernel memory disclosure; no advisory as feature experimental MFC after: 3 days
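Why item (1) matters can be shown with a minimal sketch. DEMO_MAXDATASIZE and both functions are hypothetical and do not reflect the actual Coda limits or code.

```c
/* Hypothetical limit; not the value of VC_MAXDATASIZE. */
#define DEMO_MAXDATASIZE 8192

/* Signed length: a negative value passes the upper-bound check. */
static int
len_ok_signed(int len)
{
	return (len <= DEMO_MAXDATASIZE);
}

/* Unsigned length: the same bit pattern is a huge value and is rejected. */
static int
len_ok_unsigned(unsigned int len)
{
	return (len <= DEMO_MAXDATASIZE);
}
```

A caller-supplied length of -1 is accepted by len_ok_signed() and would then be interpreted as a very large size by a later copy, whereas len_ok_unsigned() rejects it.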
|
210925 |
06-Aug-2010 |
kib |
Enable shared lookups and extended shared ops for devfs.
In collaboration with: pho MFC after: 1 month
|
210923 |
06-Aug-2010 |
kib |
Add new make_dev_p(9) flag MAKEDEV_ETERNAL to inform devfs that created cdev will never be destroyed. Propagate the flag to devfs vnodes as VV_ETERNALDEV. Use the flags to avoid acquiring devmtx and taking a thread reference on such nodes.
In collaboration with: pho MFC after: 1 month
|
210921 |
06-Aug-2010 |
kib |
Enable shared locks for the devfs vnodes. Honor the locking mode requested by lookup(). This should be a nop at the moment.
In collaboration with: pho MFC after: 1 month
|
210918 |
06-Aug-2010 |
kib |
Initialize VV_ISTTY vnode flag on the devfs vnode creation instead of doing it on each open.
In collaboration with: pho MFC after: 1 month
|
210786 |
03-Aug-2010 |
rmacklem |
Modify the return value for nfscl_mustflush() from boolean_t, which I mistakenly thought was correct w.r.t. style(9), back to int and add the checks for != 0. This is just a stylistic modification.
MFC after: 1 week
|
210455 |
24-Jul-2010 |
rmacklem |
Move sys/nfsclient/nfs_lock.c into sys/nfs and build it as a separate module that can be used by both the regular and experimental nfs clients. This fixes the problem reported by jh@ where /dev/nfslock would be registered twice when both nfs clients were used. I also defined the size of the lm_fh field to be the correct value, as it should be the maximum size of an NFSv3 file handle.
Reviewed by: jh MFC after: 2 weeks
|
210268 |
19-Jul-2010 |
rmacklem |
For the experimental NFSv4 server's dumplocks operation, add the MPSAFE flag to cn_flags so that it doesn't panic. The panics weren't seen since nfsdumpstate(8) is broken for the "-l" case, so this was never done. I'll do a separate commit to fix nfsdumpstate(8).
Submitted by: zack.kirsch at isilon.com MFC after: 2 weeks
|
210227 |
18-Jul-2010 |
rmacklem |
Add a call to nfscl_mustflush() in nfs_close() of the experimental NFSv4 client, so that attributes are not acquired from the server when a delegation for the file is held. This can reduce the number of Getattr Ops significantly.
MFC after: 2 weeks
|
210213 |
18-Jul-2010 |
trasz |
Fix build.
Submitted by: Andreas Tobler <andreast-list at fgznet.ch>
|
210201 |
18-Jul-2010 |
rmacklem |
Change the nfscl_mustflush() function in the experimental NFSv4 client to return a boolean_t in order to make it more compatible with style(9).
MFC after: 2 weeks
|
210194 |
17-Jul-2010 |
trasz |
Remove updating process count by unionfs. It serves no purpose, unionfs just needs root credentials for a moment.
|
210178 |
16-Jul-2010 |
rmacklem |
Patch the experimental NFSv4 server so that it acquires a reference count on nfsv4rootfs_lock when dumping state, since these functions are not called by nfsd threads. Without this reference count, it is possible for an nfsd thread to acquire an exclusive lock on nfsv4rootfs_lock while the dump is in progress and then change the lists, potentially causing a crash.
Reported by: zack.kirsch at isilon.com MFC after: 2 weeks
|
210172 |
16-Jul-2010 |
jhb |
Revert the previous commit. The race is not applicable to the lockmgr implementation in 8.0 and later as its flags field does not hold dynamic state such as waiters flags, but is only modified in lockinit() aside from VN_LOCK_*().
Discussed with: attilio
|
210171 |
16-Jul-2010 |
jhb |
When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE in the vnode lock's flags) until after they had determined if the vnode was a FIFO. This occurs after the vnode has been inserted into a VFS hash or some similar table, so it is possible for another thread to find this vnode via vget() on an i-node number and block on the vnode lock. If the lockmgr interlock (vnode interlock for vnode locks) is not held when clearing the LK_NOSHARE flag, then the lk_flags field can be clobbered. As a result the thread blocked on the vnode lock may never get woken up. Fix this by holding the vnode interlock while modifying the lock flags in this case.
MFC after: 3 days
|
210154 |
16-Jul-2010 |
rmacklem |
Delete comments related to soft clock interrupts that don't apply to the FreeBSD port of the experimental NFSv4 server.
Submitted by: zack.kirsch at isilon.com MFC after: 2 weeks
|
210136 |
15-Jul-2010 |
jhb |
Retire the NFS access cache timestamp structure. It was used in VOP_OPEN() to avoid sending multiple ACCESS/GETATTR RPCs during a single open() between VOP_LOOKUP() and VOP_OPEN(). Now we always send the RPC in VOP_LOOKUP() and not VOP_OPEN() in the cases that multiple RPCs could be sent.
MFC after: 2 weeks
|
210135 |
15-Jul-2010 |
jhb |
Merge 208603, 209946, and 209948 to the new NFS client: Move attribute cache flushes from VOP_OPEN() to VOP_LOOKUP() to provide more graceful recovery for stale filehandles and eliminate the need for conditionally clearing the attribute cache in the !NMODIFIED case in VOP_OPEN().
Reviewed by: rmacklem MFC after: 2 weeks
|
210102 |
15-Jul-2010 |
rmacklem |
This patch fixes a bug in the experimental NFSv4 server where it released a reference count on nfsv4rootfs_lock erroneously when administrative revocation of state was done.
Submitted by: zack.kirsch at isilon.com MFC after: 2 weeks
|
210034 |
13-Jul-2010 |
rmacklem |
For the experimental NFSv4 client, make sure that attributes that predate the issue of a delegation are not cached once the delegation is held. This is necessary, since cached attributes remain valid while the delegation is held.
MFC after: 2 weeks
|
210032 |
13-Jul-2010 |
rmacklem |
For the experimental NFSv4 client, do not use cached attributes that were invalidated, even when a delegation for the file is held.
MFC after: 2 weeks
|
210030 |
13-Jul-2010 |
rmacklem |
Fix a bogus comment that mentions lru lists that don't exist.
Reported by: zack.kirsch at isilon.com MFC after: 2 weeks
|
209425 |
22-Jun-2010 |
avg |
udf_vnops: cosmetic followup to r208671 - better looking code
Suggested by: jhb MFC after: 3 days
|
209320 |
18-Jun-2010 |
alc |
Eliminate unnecessary page queues locking.
|
209226 |
16-Jun-2010 |
alc |
Eliminate unnecessary page queues locking.
|
209191 |
15-Jun-2010 |
rmacklem |
Add MODULE_DEPEND() macros to the experimental NFS client and server so that the modules will load when kernels are built with none of the NFS* configuration options specified. I believe this resolves the problems reported by PR kern/144458 and the email on freebsd-stable@ posted by Dmitry Pryanishnikov on June 13.
Tested by: kib PR: kern/144458 Reviewed by: kib MFC after: 1 week
|
209120 |
13-Jun-2010 |
kib |
In NFS clients, instead of inconsistently using #ifdef DIAGNOSTIC and #ifndef DIAGNOSTIC for debug assertions, prefer KASSERT(). Also change one #ifdef DIAGNOSTIC in the new nfs server.
Submitted by: Mikolaj Golub <to.my.trociny gmail com> MFC after: 2 weeks
|
209062 |
11-Jun-2010 |
avg |
fix a few cases where a string is passed via format argument instead of via %s
Most of the cases looked harmless, but this is done for the sake of correctness. In one case it even allowed an intermediate buffer to be dropped.
Found by: clang MFC after: 2 weeks
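Illustration (not part of the commit): a minimal sketch of the pattern described above, where a string is handed to a printf-style function as an argument to "%s" rather than as the format itself. The helper name log_message() is hypothetical and does not appear in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy a message of unknown origin into buf.
 * The message is passed as the argument of "%s", never as the format,
 * so any '%' sequences it contains are copied literally instead of
 * being interpreted as conversions.
 * Returns true if the whole message fit into the buffer.
 */
static bool
log_message(char *buf, size_t buflen, const char *msg)
{
    int n;

    assert(buf != NULL && msg != NULL && buflen > 0);
    n = snprintf(buf, buflen, "%s", msg);
    return (n >= 0 && (size_t)n < buflen);
}
```

Writing snprintf(buf, buflen, msg) instead would treat the "%d" and "%s" inside a message such as "100%d done %s" as conversions with no matching arguments, which is undefined behavior.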
|
208951 |
09-Jun-2010 |
jh |
Add a new function devfs_parent_dirent() for resolving devfs parent directory entry. Use the new function in devfs_fqpn(), devfs_lookupx() and devfs_vptocnp() instead of manually resolving the parent entry.
Reviewed by: kib
|
208717 |
01-Jun-2010 |
jh |
Don't try to call cdevsw d_close() method when devfs_close() is called because of insmntque1() failure.
Found with: stress2 Suggested and reviewed by: kib
|
208671 |
31-May-2010 |
avg |
udf_readlink: fix malloc call with uninitialized size parameter
Found by: clang static analyzer MFC after: 4 days
|
208254 |
18-May-2010 |
rmacklem |
Allow the experimental NFSv4 client to use cached attributes when a write delegation is held. Also, add a missing mtx_unlock() call for the ACL debugging code.
MFC after: 5 days
|
208234 |
18-May-2010 |
rmacklem |
Add a sanity check for a negative args.fhsize to the experimental NFS client.
MFC after: 5 days
|
208128 |
16-May-2010 |
kib |
Disable bypass for the vop_advlockpurge(). The vop is called after vop_revoke(), the v_data is already destroyed.
Reported and tested by: ed
|
207848 |
10-May-2010 |
kib |
The thread_unsuspend() requires both process mutex and process spinlock locked. Postpone the process unlock till the thread_unsuspend() is called.
Approved by: des (procfs maintainer) MFC after: 1 week
|
207847 |
10-May-2010 |
kib |
For detach procfs ctl command, also clear P_STOPPED_TRACE process stop flag, and for each thread, TDB_SUSPEND debug flag, same as it is done by exit1() for orphaned debuggee.
Approved by: des (procfs maintainer) MFC after: 1 week
|
207785 |
08-May-2010 |
rmacklem |
Fix typos in macros.
PR: kern/146375 Submitted by: simon AT comsys.ntu-kpi.kiev.ua MFC after: 1 week
|
207764 |
08-May-2010 |
rmacklem |
Patch the experimental NFS client so that it works for NFSv2 by adding the necessary mapping from NFSv3 procedure numbers to NFSv2 procedure numbers when doing NFSv2 RPCs.
MFC after: 1 week
|
207746 |
07-May-2010 |
alc |
Push down the page queues lock into vm_page_activate().
|
207729 |
06-May-2010 |
kib |
Add MAKEDEV_NOWAIT flag to make_dev_credf(9), to create a device node in a no-sleep context. If resource allocation cannot be done without sleep, make_dev_credf() fails and returns NULL.
Reviewed by: jh MFC after: 2 weeks
|
207728 |
06-May-2010 |
alc |
Eliminate page queues locking around most calls to vm_page_free().
|
207719 |
06-May-2010 |
trasz |
Style fixes and removal of unneeded variable.
Submitted by: bde@
|
207669 |
05-May-2010 |
alc |
Acquire the page lock around all remaining calls to vm_page_free() on managed pages that didn't already have that lock held. (Freeing an unmanaged page, such as the various pmaps use, doesn't require the page lock.)
This allows a change in vm_page_remove()'s locking requirements. It now expects the page lock to be held instead of the page queues lock. Consequently, the page queues lock is no longer required at all by callers to vm_page_rename().
Discussed with: kib
|
207662 |
05-May-2010 |
trasz |
Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize().
Reviewed by: kib
|
207644 |
05-May-2010 |
alc |
Push down the acquisition of the page queues lock into vm_page_unwire().
Update the comment describing which lock should be held on entry to vm_page_wire().
Reviewed by: kib
|
207584 |
03-May-2010 |
kib |
Lock the page around vm_page_activate() and vm_page_deactivate() calls where it was missed. The wrapped fragments now protect wire_count with page lock.
Reviewed by: alc
|
207573 |
03-May-2010 |
alc |
Acquire the page lock around vm_page_unwire() and vm_page_wire().
Reviewed by: kib
|
207530 |
02-May-2010 |
alc |
It makes no sense for vm_page_sleep_if_busy()'s helper, vm_page_sleep(), to unconditionally set PG_REFERENCED on a page before sleeping. In many cases, it's perfectly ok for the page to disappear, i.e., be reclaimed by the page daemon, before the caller to vm_page_sleep() is reawakened. Instead, we now explicitly set PG_REFERENCED in those cases where having the page persist until the caller is awakened is clearly desirable. Note, however, that setting PG_REFERENCED on the page is still only a hint, and not a guarantee that the page should persist.
|
207350 |
28-Apr-2010 |
rmacklem |
For the experimental NFS client, it should always flush dirty buffers before closing the NFSv4 opens, as the comment states. This patch deletes the call to nfscl_mustflush() which would return 0 for the case where a delegation still exists, which was incorrect and could cause crashes during recovery from an expired lease.
MFC after: 1 week
|
207349 |
28-Apr-2010 |
rmacklem |
Delete a diagnostic statement that is no longer useful from the experimental NFS client.
MFC after: 1 week
|
207170 |
24-Apr-2010 |
rmacklem |
An NFSv4 server will reply NFSERR_GRACE for non-recovery RPCs during the grace period after startup. This grace period must be at least the lease duration, which is typically 1-2 minutes. It seems prudent for the experimental NFS client to wait a few seconds before retrying such an RPC, so that the server isn't flooded with non-recovery RPCs during recovery. This patch adds an argument to nfs_catnap() to implement a 5 second delay for this case.
MFC after: 1 week
|
207082 |
22-Apr-2010 |
rmacklem |
When the experimental NFS client is handling an NFSv4 server reboot with delegations enabled, the recovery could fail if the renew thread is trying to return a delegation, since it will not do the recovery. This patch fixes the above by having nfscl_recalldeleg() fail with the I/O operations returning EIO, so that they will be attempted later. Most of the patch consists of adding an argument to various functions to indicate the delegation recall case where this needs to be done.
MFC after: 1 week
|
206894 |
20-Apr-2010 |
kib |
The cache_enter(9) function shall not be called for doomed dvp. Assert this.
In the reported panic, vdestroy() fired the assertion "vp has namecache for ..", because pseudofs may end up doing cache_enter() with reclaimed dvp, after dotdot lookup temporarily unlocked dvp. Similar problem exists in ufs_lookup() for "." lookup, when vnode lock needs to be upgraded.
Verify that dvp is not reclaimed before calling cache_enter().
Reported and tested by: pho Reviewed by: kan MFC after: 2 weeks
|
206880 |
20-Apr-2010 |
rmacklem |
For the experimental NFS client doing an NFSv4 mount, set the NFSCLFLAGS_RECVRINPROG while doing recovery from an expired lease in a manner similar to r206818 for server reboot recovery. This will prevent the function that acquires stateids for I/O operations from acquiring out of date stateids during recovery. Also, fix up mutex locking on the nfsc_flags field.
MFC after: 1 week
|
206818 |
18-Apr-2010 |
rmacklem |
Avoid extraneous recovery cycles in the experimental NFS client when an NFSv4 server reboots, by doing two things. 1 - Make the function that acquires a stateid for I/O operations block until recovery is complete, so that it doesn't acquire out of date stateids. 2 - Only allow a recovery once every 1/2 of a lease duration, since the NFSv4 server must provide a recovery grace period of at least a lease duration. This should avoid recoveries caused by an out of date stateid that was acquired for an I/O op. just before a recovery cycle started.
MFC after: 1 week
|
206698 |
16-Apr-2010 |
jh |
Revert r206560. The change doesn't work correctly in all cases with multiple devfs mounts.
|
206690 |
15-Apr-2010 |
rmacklem |
Add mutex lock calls to 2 cases in the experimental NFS client's renew thread where they were missing.
MFC after: 1 week
|
206688 |
15-Apr-2010 |
rmacklem |
The experimental NFS client was not filling in recovery credentials for opens done locally in the client when a delegation for the file was held. This could cause the client to crash in crsetgroups() when recovering from a server crash/reboot. This patch fills in the recovery credentials for this case, in order to avoid the client crash. Also, add KASSERT()s to the credential copy functions, to catch any other cases where the credentials aren't filled in correctly.
MFC after: 1 week
|
206560 |
13-Apr-2010 |
jh |
- Ignore and report duplicate and empty device names in devfs_populate_loop() instead of causing erratic behavior. Currently make_dev(9) can't fail, so there is no way to report an error to make_dev(9) callers.
- Disallow using "." and ".." in device path names. It didn't work previously but now it is reported rather than panicking.
- Treat multiple sequential slashes as single in device path names.
Discussed with: pjd
|
206361 |
07-Apr-2010 |
joel |
Switch to our preferred 2-clause BSD license.
Approved by: bp
|
206236 |
06-Apr-2010 |
rmacklem |
Harden the experimental NFS server a little, by adding range checks on the length of the client's open/lock owner name. Also, add free()'s for one case where they were missing and would have caused a leak if NFSERR_BADXDR had been replied. Probably never happens, but the leak is now plugged, just in case.
MFC after: 2 weeks
|
206210 |
05-Apr-2010 |
rwatson |
Synchronize Coda kernel module definitions in our coda.h to Coda 6's coda.h:
- CodaFid typedef -> struct CodaFid throughout.
- Use unsigned int instead of unsigned long for venus_dirent and other cosmetic fixes.
- Introduce cuid_t and cgid_t and use instead of uid_t and gid_t in RPCs.
- Synchronize comments and macros.
- Use u_int32_t instead of unsigned long for coda_out_hdr.
With these changes, a 64-bit Coda kernel module now works with coda6_client, whereas previous userspace and kernel versions of RPCs differed sufficiently to prevent using the file system. This has been verified only with casual testing, but /coda is now usable for at least basic operations on amd64.
MFC after: 1 week
|
206206 |
05-Apr-2010 |
rwatson |
Correct definition of CIOC_KERNEL_VERSION Coda ioctl() for systems where sizeof(int) != sizeof(sizeof(int)), or the ioctl will return EINVAL.
MFC after: 3 days
|
206170 |
04-Apr-2010 |
rmacklem |
Harden the experimental NFS server a little, by adding extra checks in the readdir functions for non-positive byte count arguments. For the negative case, set it to the maximum allowable, since it was actually a large positive value (unsigned) on the wire. Also, fix up the readdir function comment a bit.
Suggested by: dillon AT apollo.backplane.com MFC after: 2 weeks
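Illustration (not part of the commit): a minimal sketch of the negative-count handling described above. The limit READDIR_MAX_BYTES and the function name are hypothetical, and clamping oversized positive counts is an added assumption; the commit text only states that a negative count is set to the maximum allowable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit, not the server's real constant. */
#define READDIR_MAX_BYTES 65536

static_assert(READDIR_MAX_BYTES <= INT32_MAX, "limit must fit in int32_t");

/*
 * The count travels as an unsigned 32-bit value. Once it is stored in a
 * signed integer, anything above INT32_MAX shows up as negative, so a
 * negative count really means "very large" and is clamped to the limit.
 * A zero count is returned unchanged for the caller to reject.
 */
static int32_t
clamp_readdir_count(int32_t cnt)
{
    if (cnt < 0 || cnt > READDIR_MAX_BYTES)
        return (READDIR_MAX_BYTES);
    return (cnt);
}
```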
|
206098 |
02-Apr-2010 |
avg |
mountmsdosfs: reject too high value of bytes per cluster
Bytes per cluster are calculated as bytes per sector times sectors per cluster. Too high a value can overflow an internal variable whose type can hold only values in the valid range. Trying to use a wider type results in an attempt to read more than MAXBSIZE at once, which causes a panic. Unfortunately, it is FreeBSD newfs_msdos that produces filesystems with invalid parameters for certain types of media.
Reported by: Fabian Keil <freebsd-listen@fabiankeil.de>, Paul B. Mahol <onemda@gmail.com> Discussed with: bde, kib MFC after: 1 week X-ToDo: fix newfs_msdos
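Illustration (not part of the commit): a minimal sketch of a mount-time sanity check in the spirit of the description above. The function name and the MAXBSIZE_SKETCH value of 65536 are assumptions made for this example; the actual check in mountmsdosfs() may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's MAXBSIZE. */
#define MAXBSIZE_SKETCH 65536u

/* The widest possible product still fits comfortably in 32 bits. */
static_assert((uint64_t)UINT16_MAX * UINT8_MAX <= UINT32_MAX,
    "bytes-per-cluster product fits in uint32_t");

/*
 * Compute bytes per cluster in a type wide enough to hold the product,
 * then reject values that a single buffer read could not cover.
 */
static bool
cluster_size_ok(uint16_t bytes_per_sec, uint8_t sec_per_clust)
{
    uint32_t bpc = (uint32_t)bytes_per_sec * sec_per_clust;

    return (bpc != 0 && bpc <= MAXBSIZE_SKETCH);
}
```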
|
206093 |
02-Apr-2010 |
kib |
Add function vop_rename_fail(9) that performs needed cleanup for locks and references of the VOP_RENAME(9) arguments. Use vop_rename_fail() in deadfs_rename().
Tested by: Mikolaj Golub MFC after: 1 week
|
206063 |
02-Apr-2010 |
rmacklem |
For the experimental NFS server, add a call to free the lookup path buffer for one case where it was missing when doing mkdir. This could have conceivably resulted in a leak of a buffer, but a leak was never observed during testing, so I suspect it would have occurred rarely, if ever, in practice.
MFC after: 2 weeks
|
206061 |
02-Apr-2010 |
rmacklem |
Add SAVENAME to the cn_flags for all cases in the experimental NFS server for the CREATE cn_nameiop where SAVESTART isn't set. I was not aware that this needed to be done by the caller until recently.
Tested by: lampa AT fit.vutbr.cz (link case) Submitted by: lampa AT fit.vutbr.cz (link case) MFC after: 2 weeks
|
205941 |
30-Mar-2010 |
rmacklem |
This patch should fix handling of byte range locks locally on the server for the experimental nfs server. When enabled by setting vfs.newnfs.locallocks_enable to non-zero, the experimental nfs server will now acquire byte range locks on the file on behalf of NFSv4 clients, such that lock conflicts between the NFSv4 clients and processes running locally on the server, will be recognized and handled correctly.
MFC after: 2 weeks
|
205663 |
26-Mar-2010 |
rmacklem |
Patch the experimental NFS server in a manner analogous to r205661 for the regular NFS server, to ensure that ESTALE is returned to the client for all errors returned by VFS_FHTOVP().
MFC after: 2 weeks
|
205572 |
24-Mar-2010 |
rmacklem |
Fix the experimental NFS subsystem so that it uses the correct preprocessor macro name for not requiring strict data alignment.
Suggested by: marius MFC after: 2 weeks
|
205223 |
16-Mar-2010 |
jkim |
Fix a long standing regression of readdir(3) in fdescfs(5) introduced in r1.48. We were stopping at the first null pointer when multiple file descriptors were opened and one in the middle was closed. This restores traditional behaviour of fdescfs.
MFC after: 3 days
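Illustration (not part of the commit): a minimal sketch of the iteration bug described above, using a plain array of pointers as a hypothetical stand-in for the descriptor table. A closed slot in the middle is skipped with continue; ending the loop there would hide every descriptor after it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Collect the indices of all open slots in a descriptor table.
 * A NULL slot means "closed": skip it and keep scanning.
 * Returns the number of indices written to out.
 */
static size_t
list_open_fds(void *const *ofiles, size_t nfiles, int *out, size_t outlen)
{
    size_t n = 0;

    assert(ofiles != NULL && out != NULL);
    for (size_t i = 0; i < nfiles && n < outlen; i++) {
        if (ofiles[i] == NULL)
            continue;   /* closed slot: skip it, do not stop */
        out[n++] = (int)i;
    }
    return (n);
}
```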
|
205014 |
11-Mar-2010 |
nwhitehorn |
Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms.
Reviewed by: kib, jhb
|
205010 |
11-Mar-2010 |
rwatson |
Update nfsrv_getsocksndseq() for changes in TCP internals since FreeBSD 6.x:
- so_pcb is now guaranteed to be non-NULL and valid if a valid socket reference is held.
- Need to check INP_TIMEWAIT and INP_DROPPED before assuming inp_ppcb is a tcpcb, as it might be a tcptw or NULL otherwise.
- tp can never be NULL by the end of the function, so only check TCPS_ESTABLISHED before extracting tcpcb fields.
The NFS server arguably incorporates too many assumptions about TCP internals, but fixing that is left for another day.
MFC after: 1 week Reviewed by: bz Reviewed and tested by: rmacklem Sponsored by: Juniper Networks
|
204675 |
03-Mar-2010 |
kib |
When returning error from msdosfs_lookup(), make sure that *vpp is NULL. lookup() KASSERTs this condition.
Reported and tested by: pho MFC after: 3 weeks
|
204589 |
02-Mar-2010 |
kib |
Do not leak vnode lock when msdosfs mount is updated and specified device is different from the device used for the original mount.
Note that update_mp does not need devvp locked, and pmp->pm_devvp cannot be freed in the meantime.
Reported and tested by: pho MFC after: 3 weeks
|
204576 |
02-Mar-2010 |
kib |
Only destroy pm_fatlock on error if it was initialized.
MFC after: 3 weeks
|
204475 |
28-Feb-2010 |
kib |
Mark msdosfs as mpsafe.
Tested by: pho MFC after: 3 weeks
|
204474 |
28-Feb-2010 |
kib |
Fix the race between dotdot lookup and forced unmount, by using msdosfs-specific variant of vn_vget_ino(), msdosfs_deget_dotdot().
As was done for UFS, relookup the dotdot denode after the call to msdosfs_deget_dotdot(), because vnode lock is dropped and directory might be moved.
Tested by: pho MFC after: 3 weeks
|
204473 |
28-Feb-2010 |
kib |
Use pm_fatlock to protect per-filesystem rb tree used to allocate fileno on the large FAT volumes. Previously, a single global mutex was used.
Tested by: pho MFC after: 3 weeks
|
204472 |
28-Feb-2010 |
kib |
Add assertions for FAT bitmap state.
Tested by: pho MFC after: 3 weeks
|
204471 |
28-Feb-2010 |
kib |
Use pm_fatlock to protect fat bitmap.
Tested by: pho MFC after: 3 weeks
|
204470 |
28-Feb-2010 |
kib |
Add per-mountpoint lockmgr lock for msdosfs. It is intended to be used as fat bitmap lock and to replace global mutex protecting fileno rbtree.
Tested by: pho MFC after: 3 weeks
|
204469 |
28-Feb-2010 |
kib |
In msdosfs deget(), properly handle the case when the vnode is found in hash.
Tested by: pho MFC after: 3 weeks
|
204468 |
28-Feb-2010 |
kib |
In msdosfs_inactive(), reclaim the vnodes both for SLOT_DELETED and SLOT_EMPTY deName[0] values. Besides conforming to FAT specification, it also clears the issue where vfs_hash_insert found the vnode in hash, and newly allocated vnode is vput()ed. There, deName[0] == 0, and vnode is not reclaimed, indefinitely kept on mountlist.
Tested by: pho MFC after: 3 weeks
|
204467 |
28-Feb-2010 |
kib |
Remove seemingly unneeded unlock/relock of the dvp in msdosfs_rmdir, causing LOR.
Reported and tested by: pho MFC after: 3 weeks
|
204466 |
28-Feb-2010 |
kib |
Assert that the msdosfs vnode is (e)locked in several places. The plan is to use vnode lock to protect denode and fat cache, and having separate lock for block use map.
Change the check and return on impossible condition into KASSERT().
Tested by: pho MFC after: 3 weeks
|
204465 |
28-Feb-2010 |
kib |
Remove unused global statistic about fat cache usage.
Tested by: pho MFC after: 3 weeks
|
204111 |
20-Feb-2010 |
uqs |
Fix common misspelling of hierarchy
Pointed out by: bf1783 at gmail Approved by: np (cxgb), kientzle (tar, etc.), philip (mentor)
|
203866 |
14-Feb-2010 |
kib |
Invalid filesystem might cause the bp to be never read.
Noted by: Pedro F. Giffuni <giffunip tutopia com> Obtained from: NetBSD MFC after: 1 week
|
203849 |
14-Feb-2010 |
rmacklem |
Change the default value for vfs.newnfs.enable_locallocks to 0 for the experimental NFS server, since local locking is known to be broken and the patch to fix it is still a work in progress.
MFC after: 5 days
|
203848 |
14-Feb-2010 |
rmacklem |
This fixes the experimental NFS server so that it won't crash in the caching code for IPv6 by fixing a typo that used the incorrect variable. It also fixes the indentation of the statement above it.
Reported by: simon AT comsys.ntu-kpi.kiev.ua MFC after: 5 days
|
203828 |
13-Feb-2010 |
kib |
Fix function name in the comment in the second location too.
Submitted by: ed MFC after: 1 week
|
203827 |
13-Feb-2010 |
kib |
- Add idempotency guards so the structures can be used in other utilities.
- Update bpb structs with reserved fields.
- In direntry struct join deName with deExtension. Although a fix was attempted in the past, these fields were being overflowed. Now this is consistent with the spec, and we can now share the WinChksum code with NetBSD.
Submitted by: Pedro F. Giffuni <giffunip tutopia com> Mostly obtained from: NetBSD Reviewed by: bde MFC after: 2 weeks
|
203826 |
13-Feb-2010 |
kib |
Use M_ZERO instead of calling bzero(). Fix function name in the comment.
MFC after: 1 week
|
203822 |
13-Feb-2010 |
kib |
Remove unused macros.
MFC after: 1 week
|
203303 |
31-Jan-2010 |
rmacklem |
Patch the experimental NFS client so that there is a timeout for negative name cache entries in a manner analogous to r202767 for the regular NFS client. Also, make the code in nfs_lookup() compatible with that of the regular client and replace the sysctl variable that enabled negative name caching with the mount point option.
MFC after: 2 weeks
|
203292 |
31-Jan-2010 |
ed |
Properly use dev_refl()/dev_rel() in kern.devname.
While there, perform some clean-up fixes. Update some stale comments on struct cdev * instead of dev_t and devfs_random(). Also add some missing whitespace.
MFC after: 1 week
|
203164 |
29-Jan-2010 |
jh |
Add "maxfilesize" mount option for tmpfs to allow specifying the maximum file size limit. Default is UINT64_MAX when the option is not specified. It was useless to set the limit to the total amount of memory and swap in the system.
Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to check if there is enough memory available.
Remove now unused get_swpgtotal().
Reviewed by: Gleb Kurtsou Approved by: trasz (mentor)
|
203119 |
28-Jan-2010 |
rmacklem |
Patch the experimental NFS client in a manner analogous to r203072 for the regular NFS client. Also, delete two fields of struct nfsmount that are not used by the FreeBSD port of the client.
MFC after: 2 weeks
|
203086 |
27-Jan-2010 |
trasz |
Don't touch v_interlock; use VI_* macros instead.
|
202903 |
23-Jan-2010 |
marius |
On LP64 struct ifid is 64-bit aligned while struct fid is 32-bit aligned so on architectures with strict alignment requirements we can't just simply cast the latter to the former but need to copy it bytewise instead.
PR: 143010 MFC after: 3 days
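Illustration (not part of the commit): a minimal sketch of copying bytewise instead of casting between two layouts with different alignment. The two structures below are hypothetical stand-ins, not the kernel's real struct fid and struct ifid.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic handle: only 32-bit alignment is guaranteed. */
struct generic_fid {
    uint16_t len;
    uint16_t pad;
    char     data[16];
};

/* Hypothetical filesystem view: the 64-bit member raises the alignment. */
struct fs_fid {
    uint16_t len;
    uint16_t pad;
    uint32_t gen;
    uint64_t ino;
};

static_assert(sizeof(struct fs_fid) <= sizeof(struct generic_fid),
    "filesystem fid must fit inside the generic fid");

/*
 * Copy the bytes into a properly aligned object instead of dereferencing
 * (struct fs_fid *)src, which may be misaligned on strict-alignment
 * architectures.
 */
static void
fid_to_fs_fid(const struct generic_fid *src, struct fs_fid *dst)
{
    memcpy(dst, src, sizeof(*dst));
}
```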
|
202783 |
22-Jan-2010 |
jh |
Truncate read request rather than returning EIO if the request is larger than MAXPHYS + 1. This fixes a problem with cat(1) when it uses a large I/O buffer.
Reported by: Fernando Apesteguía Suggested by: jilles Reviewed by: des Approved by: trasz (mentor)
|
202708 |
20-Jan-2010 |
jh |
- Change the type of nodes_max to u_int and use "%u" format string to convert its value. [1]
- Set default tm_nodes_max to min(pages + 3, UINT32_MAX). It's more reasonable than the old four nodes per page (with page size 4096) because non-empty regular files always use at least one page. This fixes possible overflow in the calculation. [2]
- Don't allow more than tm_nodes_max nodes allocated in tmpfs_alloc_node().
PR: kern/138367 Suggested by: bde [1], Gleb Kurtsou [2] Approved by: trasz (mentor)
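Illustration (not part of the commit): a minimal sketch of the min(pages + 3, UINT32_MAX) default and of refusing allocations beyond the limit. The function names are hypothetical; the saturation test is done before the addition so that pages + 3 cannot wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Default node limit: min(pages + 3, UINT32_MAX), computed without wrapping. */
static uint32_t
default_nodes_max(uint64_t pages)
{
    if (pages > UINT32_MAX - 3)
        return (UINT32_MAX);
    return ((uint32_t)(pages + 3));
}

/* Account for one more node, refusing once the limit has been reached. */
static bool
node_alloc(uint32_t *nodes_inuse, uint32_t nodes_max)
{
    assert(nodes_inuse != NULL);
    if (*nodes_inuse >= nodes_max)
        return (false);
    (*nodes_inuse)++;
    return (true);
}
```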
|
202584 |
18-Jan-2010 |
lulf |
Revert parts of r202283:
- Return EOPNOTSUPP before EROFS to be consistent with other filesystems.
- Fix setting of the nodump flag for users without PRIV_VFS_SYSFLAGS privilege.
Submitted by: jh@
|
202283 |
14-Jan-2010 |
lulf |
Bring in the ext2fs work done by Aditya Sarawgi during and after Google Summer of Code 2009:
- BSDL block and inode allocation policies for ext2fs. This involves the use of FFS1 style block and inode allocation for ext2fs. Preallocation was removed since it was GPL'd.
- Make ext2fs MPSAFE by introducing locks to per-mount datastructures.
- Fixes for kern/122047 PR.
- Various small bugfixes.
- Move out of gnu/ directory.
Sponsored by: Google Inc. Submitted by: Aditya Sarawgi <sarawgi.aditya AT SPAMFREE gmail DOT com>
|
202187 |
13-Jan-2010 |
jh |
- Fix some style bugs in tmpfs_mount(). [1]
- Remove a stale comment about tmpfs_mem_info() 'total' argument.
Reported by: bde [1]
|
201954 |
09-Jan-2010 |
brooks |
Update the comment on printing group membership to reflect the fact that each group the process is a member of is printed rather than an entry for each group the user could be a member of.
MFC after: 3 days
|
201798 |
08-Jan-2010 |
trasz |
Remove unused smbfs_smb_qpathinfo().
|
201773 |
08-Jan-2010 |
jh |
- Change the type of size_max to u_quad_t because its value is converted with vfs_scanopt(9) using the "%qu" format string.
- Limit the maximum value of size_max to (SIZE_MAX - PAGE_SIZE) to prevent overflow in howmany() macro.
PR: kern/141194 Approved by: trasz (mentor) MFC after: 2 weeks
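Illustration (not part of the commit): a minimal sketch of why the limit matters. HOWMANY mirrors the usual rounding-up division macro, PAGE_SIZE_SKETCH is an assumed page size of 4096, and pages_for_size() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH ((size_t)4096)             /* assumed page size */
#define HOWMANY(x, y)    (((x) + ((y) - 1)) / (y))  /* round-up division */

static_assert((PAGE_SIZE_SKETCH & (PAGE_SIZE_SKETCH - 1)) == 0,
    "page size must be a power of two");

/*
 * Clamp the requested size to SIZE_MAX - PAGE_SIZE before rounding up,
 * so that adding (PAGE_SIZE - 1) inside HOWMANY cannot wrap around.
 */
static size_t
pages_for_size(uint64_t size_max)
{
    size_t limit = SIZE_MAX - PAGE_SIZE_SKETCH;
    size_t sz = size_max > limit ? limit : (size_t)size_max;

    return (HOWMANY(sz, PAGE_SIZE_SKETCH));
}
```

Without the clamp, HOWMANY(SIZE_MAX, 4096) would add 4095 to SIZE_MAX, wrap around to 4094, and yield zero pages.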
|
201442 |
03-Jan-2010 |
rmacklem |
The test for "same client" for the experimental nfs server over NFSv4 was broken w.r.t. byte range lock conflicts when it was the same client and the request used the open_to_lock_owner4 case, since lckstp->ls_clp was not set. This patch fixes it by using "clp" instead of "lckstp->ls_clp".
MFC after: 2 weeks
|
201439 |
03-Jan-2010 |
rmacklem |
Fix three related problems in the experimental nfs client when checking for conflicts w.r.t. byte range locks for NFSv4. 1 - Return 0 instead of EACCES when a conflict is found, for F_GETLK. 2 - Check for "same file" when checking for a conflict. 3 - Don't check for a conflict for the F_UNLCK case.
|
201345 |
31-Dec-2009 |
rmacklem |
Fix the experimental NFS client so that it can create Unix domain sockets on an NFSv4 mount point. It was generating incorrect XDR in the request for this case.
Tested by: infofarmer MFC after: 2 weeks
|
201029 |
26-Dec-2009 |
rmacklem |
When porting the experimental nfs subsystem to the FreeBSD8 krpc, I added 3 functions that were already in the experimental client under different names. This patch deletes the functions in the experimental client and renames the calls to use the other set. (This is just removal of duplicated code and does not fix any bug.)
MFC after: 2 weeks
|
200999 |
25-Dec-2009 |
rmacklem |
Modify the experimental server so that it uses VOP_ACCESSX(). This is necessary in order to enable NFSv4 ACL support. The argument to nfsvno_accchk() was changed to an accmode_t and the function nfsrv_aclaccess() was no longer needed and, therefore, deleted.
Reviewed by: trasz MFC after: 2 weeks
|
200732 |
19-Dec-2009 |
ed |
Let access overriding to TTYs depend on the cdev_priv, not the vnode.
This commit changes two things, which improve access to TTYs in exceptional conditions. The problem was that when you ran jexec(8) to attach to a jail, you couldn't use /dev/tty (well, also the node of the actual TTY, e.g. /dev/pts/X). This is very inconvenient if you want to attach to screens quickly, use ssh(1), etc.
The fixes:
- Cache the cdev_priv of the controlling TTY in struct session. Change devfs_access() to compare against the cdev_priv instead of the vnode. This allows you to bypass UNIX permissions, even across different mounts of devfs.
- Extend devfs_prison_check() to unconditionally expose the device node of the controlling TTY, even if normal prison nesting rules normally don't allow this. This actually allows you to interact with this device node.
To be honest, I'm not really happy with this solution. We now have to store three pointers to a controlling TTY (s_ttyp, s_ttyvp, s_ttydp). In an ideal world, we should just get rid of the latter two and only use s_ttyp, but this makes certain pieces of code very impractical (e.g. devfs, kern_exit.c).
Reported by: Many people
|
200287 |
08-Dec-2009 |
delphij |
Allow using IPv6 in nfsrvd_sentcache() callback.
PR: kern/141289 Submitted by: Petr Lampa <lampa fit vutbr cz> Approved by: rmacklem MFC after: 1 week
|
200214 |
07-Dec-2009 |
guido |
Fix ntfs such that it understands media with a non-512-byte sector size:
1. Fixups are always done on 512 byte chunks (instead of sectors). This is kind of stupid.
2. Convert between NTFS blocknumbers (the blocksize equals the media sector size) and the bread() and getblk() blocknr (which are 512-byte sized)
NB: this change should not affect ntfs for 512-byte sector sizes.
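Illustration (not part of the commit): a minimal sketch of the block number conversion described in item 2. The function name and DEV_BSIZE_SKETCH are hypothetical; the sketch assumes the sector size is a multiple of 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSIZE_SKETCH 512u   /* unit assumed for bread()/getblk() block numbers */

/*
 * Convert a block number counted in media sectors into one counted in
 * 512-byte units. For 512-byte sectors the factor is 1, so nothing changes.
 */
static uint64_t
ntfs_bn_to_devbn(uint64_t ntfs_bn, uint32_t sector_size)
{
    assert(sector_size >= DEV_BSIZE_SKETCH);
    assert(sector_size % DEV_BSIZE_SKETCH == 0);
    return (ntfs_bn * (sector_size / DEV_BSIZE_SKETCH));
}
```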
|
200069 |
03-Dec-2009 |
trasz |
Remove unneeded ifdefs.
Reviewed by: rmacklem
|
200041 |
02-Dec-2009 |
trasz |
Don't use ap->a_td->td_ucred when we were passed ap->a_cred.
|
199715 |
23-Nov-2009 |
rmacklem |
Modify the experimental nfs server so that it falls back to using VOP_LOOKUP() when VFS_VGET() returns EOPNOTSUPP in the ReaddirPlus RPC. This patch is based upon one by pjd@ for the regular nfs server which has not yet been committed. It is needed when a ZFS volume is exported and ReaddirPlus (which almost always happens for NFSv4) is performed by a client. The patch also simplifies vnode lock handling somewhat.
MFC after: 2 weeks
|
199616 |
20-Nov-2009 |
rmacklem |
Patch the experimental NFS server in a manner analogous to r197525, so that the creation verifier is handled correctly in va_atime for 64bit architectures. There were two problems. One was that the code incorrectly assumed that sizeof (struct timespec) == 8 and the other was that the tv_sec field needs to be assigned from a signed 32bit integer, so that sign extension occurs on 64bit architectures. This is required for correct operation when exporting ZFS volumes.
Reviewed by: pjd MFC after: 2 weeks
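Illustration (not part of the commit): a minimal sketch of the two points above, using a hypothetical timespec stand-in with a 64-bit tv_sec. The 8-byte verifier is split into two signed 32-bit words and assigned field by field, rather than copied over the structure as if it were 8 bytes long.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct timespec on a 64-bit architecture. */
struct sketch_timespec {
    int64_t tv_sec;
    long    tv_nsec;
};

static_assert(sizeof(int32_t) * 2 == 8, "verifier is two 32-bit words");

/*
 * Store an 8-byte creation verifier in a timestamp. Each half is read as
 * a signed 32-bit integer, so the assignment to the wider tv_sec field
 * sign-extends it.
 */
static void
verf_to_atime(const unsigned char verf[8], struct sketch_timespec *ts)
{
    int32_t cverf[2];

    memcpy(cverf, verf, sizeof(cverf));
    ts->tv_sec = cverf[0];
    ts->tv_nsec = cverf[1];
}
```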
|
199189 |
11-Nov-2009 |
jh |
Create verifier used by FreeBSD NFS client is suboptimal because the first part of a verifier is set to the first IP address from V_in_ifaddrhead list. This address is typically the loopback address making the first part of the verifier practically non-unique. The second part of the verifier is initialized to zero making its initial value non-unique too.
This commit changes the strategy for create verifier initialization: just initialize it to a random value. Also move verifier handling into its own function and use a mutex to protect the variable.
This change is a candidate for porting to sys/nfsclient.
Reviewed by: jhb, rmacklem Approved by: trasz (mentor)
|
199007 |
06-Nov-2009 |
attilio |
- Improve comments about locking of the "struct fifoinfo" which is a bit unclear.
- Fix a memory leak [0]
[0] Diagnosed by: Dorr H. Clark <dclark at engr dot scu dot edu> MFC: 1 week
|
198494 |
26-Oct-2009 |
alc |
There is no need to "busy" a page when the object is locked for the duration of the operation.
|
198448 |
24-Oct-2009 |
ru |
Spell DIAGNOSTIC correctly.
|
198291 |
20-Oct-2009 |
jh |
Unloading of the nfscl module is unsupported because newnfslock doesn't support unloading. It's not trivial to implement newnfslock unloading so for now just admit that unloading is unsupported and refuse to attempt unload in all nfscl module event handlers.
Reviewed by: rmacklem Approved by: trasz (mentor)
|
198290 |
20-Oct-2009 |
jh |
Fix ordering of nfscl_modevent() and ncl_uninit(). nfscl_modevent() must be called after ncl_uninit() when unloading the nfscl module because ncl_uninit() uses ncl_iod_mutex which is destroyed in nfscl_modevent().
Reviewed by: rmacklem Approved by: trasz (mentor)
|
198289 |
20-Oct-2009 |
jh |
Fix comment typos.
Reviewed by: rmacklem Approved by: trasz (mentor)
|
197953 |
11-Oct-2009 |
delphij |
Add locking around access to parent node, and bail out when the parent node is already freed rather than panicking the system.
PR: kern/122038 Submitted by: gk Tested by: pho MFC after: 1 week
|
197850 |
07-Oct-2009 |
delphij |
Add a special workaround to handle UIO_NOCOPY case. This fixes data corruption observed when sendfile() is being used.
PR: kern/127213 Submitted by: gk MFC after: 2 weeks
|
197740 |
04-Oct-2009 |
delphij |
Fix a bug that causes the fsx test case of mmap'ed page being out of sync of read/write, inspired by ZFS's counterpart.
PR: kern/139312 Submitted by: gk@ MFC after: 1 week
|
197680 |
01-Oct-2009 |
trasz |
Provide default implementation for VOP_ACCESS(9), so that filesystems which want to provide VOP_ACCESSX(9) don't have to implement both. Note that this commit makes implementation of either of these two mandatory.
Reviewed by: kib
|
197650 |
30-Sep-2009 |
trasz |
Fix typo in the comment.
|
197428 |
23-Sep-2009 |
kib |
Add per-process osrel node to the procfs, to allow read and set p_osrel value for the process.
Approved by: des (procfs maintainer) MFC after: 3 weeks
|
197134 |
12-Sep-2009 |
rwatson |
Use C99 initialization for struct filterops.
Obtained from: Mac OS X Sponsored by: Apple Inc. MFC after: 3 weeks
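As an illustration of what the switch to C99 initialization buys, here is a minimal userland sketch. The struct demo_filterops type and all demo_* names are hypothetical stand-ins invented for this example, not the kernel's struct filterops. It compares a positional initializer with a designated one that produces the same table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ops table; NOT the real struct filterops. */
struct demo_filterops {
	int	isfd;
	int	(*attach)(int id);
	void	(*detach)(int id);
	int	(*event)(int id, long hint);
};

static void
demo_detach(int id)
{
	(void)id;
}

static int
demo_event(int id, long hint)
{
	return (id > 0 && hint != 0);
}

/* Positional form: every slot must be listed, in declaration order. */
static const struct demo_filterops demo_positional_ops =
    { 1, NULL, demo_detach, demo_event };

/* C99 designated form: members are named, omitted ones are zeroed. */
static const struct demo_filterops demo_designated_ops = {
	.isfd = 1,
	.detach = demo_detach,
	.event = demo_event,
};

/* Returns 1 when both tables hold the same values. */
int
demo_ops_equivalent(void)
{
	assert(demo_designated_ops.attach == NULL);
	return (demo_positional_ops.isfd == demo_designated_ops.isfd &&
	    demo_positional_ops.attach == demo_designated_ops.attach &&
	    demo_positional_ops.detach == demo_designated_ops.detach &&
	    demo_positional_ops.event == demo_designated_ops.event);
}
```

The designated form keeps working if members are reordered or new ones are added, which is presumably the motivation for the change.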
|
197048 |
09-Sep-2009 |
rmacklem |
Add LK_NOWITNESS to the vn_lock() calls done on newly created nfs vnodes, since these nodes are not linked into the mount queue and, as such, the vn_lock() cannot cause a deadlock so LORs are harmless.
Suggested by: kib Approved by: kib (mentor) MFC after: 3 days
|
196970 |
08-Sep-2009 |
phk |
Revert previous commit and add myself to the list of people who should know better than to commit with a cat in the area.
|
196969 |
08-Sep-2009 |
phk |
Add necessary include.
|
196921 |
07-Sep-2009 |
kib |
If a race is detected, pfs_vncache_alloc() may reclaim a vnode that had never been inserted into the pfs_vncache list. Since pfs_vncache_free() does not anticipate this case, it decrements pfs_vncache_entries unconditionally; if the vnode was not in the list, pfs_vncache_entries will no longer reflect the actual number of list entries. This may cause size of the cache to exceed the configured maximum. It may also trigger a panic during module unload or system shutdown.
Do not decrement pfs_vncache_entries for the vnode that was not in the list.
Submitted by: tegge Reviewed by: des MFC after: 1 week
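The counter fix described above can be sketched in userland C. This is an assumption-laden model, not the pseudofs code: struct demo_cache, struct demo_node, the on_list flag and the demo_* functions are all hypothetical names. The point is only that the free path decrements the entry count when, and only when, the node was actually linked into the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache node; on_list records whether it was ever inserted. */
struct demo_node {
	struct demo_node *next;
	struct demo_node *prev;
	int on_list;
};

/* Hypothetical cache: a doubly linked list plus an entry counter. */
struct demo_cache {
	struct demo_node *head;
	int entries;
};

void
demo_cache_insert(struct demo_cache *c, struct demo_node *n)
{
	n->prev = NULL;
	n->next = c->head;
	if (c->head != NULL)
		c->head->prev = n;
	c->head = n;
	n->on_list = 1;
	c->entries++;
}

void
demo_cache_free(struct demo_cache *c, struct demo_node *n)
{
	/* A node that lost the race was never inserted: leave the counter alone. */
	if (!n->on_list)
		return;
	if (n->prev != NULL)
		n->prev->next = n->next;
	else
		c->head = n->next;
	if (n->next != NULL)
		n->next->prev = n->prev;
	n->next = n->prev = NULL;
	n->on_list = 0;
	c->entries--;
	assert(c->entries >= 0);
}
```

Without the on_list check, freeing a never-inserted node would drive the counter below the real list length, which matches the drift the commit message describes.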
|
196920 |
07-Sep-2009 |
kib |
insmntque_stddtr() clears vp->v_data and resets vp->v_op to dead_vnodeops before calling vgone(). Revert r189706 and corresponding part of the r186560.
Noted and reviewed by: tegge Approved by: des (pseudofs part) MFC after: 3 days
|
196689 |
31-Aug-2009 |
kib |
Remove spurious pfs_unlock().
PR: kern/137310 Reviewed by: des MFC after: 3 days
|
196556 |
25-Aug-2009 |
jilles |
Fix poll() on half-closed sockets, while retaining POLLHUP for fifos.
This reverts part of r196460, so that sockets only return POLLHUP if both directions are closed/error. Fifos get POLLHUP by closing the unused direction immediately after creating the sockets.
The tools/regression/poll/*poll.c tests now pass except for two other things:
- if POLLHUP is returned, POLLIN is always returned as well instead of only when there is data left in the buffer to be read
- fifo old/new reader distinction does not work the way POSIX specs it
Reviewed by: kib, bde
|
196503 |
24-Aug-2009 |
zec |
Fix NFS panics with options VIMAGE kernels by appropriately setting curvnet context inside the RPC code.
Temporarily set td's cred to mount's cred before calling socreate() via __rpc_nconf2socket().
Submitted by: rmacklem (in part) Reviewed by: rmacklem, rwatson Discussed with: dfr, bz Approved by: re (rwatson), julian (mentor) MFC after: 3 days
|
196332 |
17-Aug-2009 |
rmacklem |
Apply the same patch as r196205 for nfs_upgrade_lock() and nfs_downgrade_lock() to the experimental nfs client.
Approved by: re (kensmith), kib (mentor)
|
196019 |
01-Aug-2009 |
rwatson |
Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes.
Reviewed by: bz Approved by: re (vimage blanket)
|
195995 |
31-Jul-2009 |
jhb |
Fix some LORs between vnode locks and filedescriptor table locks.
- Don't grab the filedesc lock just to read fd_cmask.
- Drop vnode locks earlier when mounting the root filesystem and before sanitizing stdin/out/err file descriptors during execve().
Submitted by: kib Approved by: re (rwatson) MFC after: 1 week
|
195943 |
29-Jul-2009 |
rmacklem |
Fix the experimental nfs client so that it only calls ncl_vinvalbuf() for NFSv2 and not NFSv4 when nfscl_mustflush() returns 0. Since nfscl_mustflush() only returns 0 when there is a valid write delegation issued to the client, it only affects the case of an NFSv4 mount with callbacks/delegations enabled.
Approved by: re (kensmith), kib (mentor)
|
195840 |
24-Jul-2009 |
jhb |
Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
195825 |
22-Jul-2009 |
rmacklem |
When vfs.newnfs.callback_addr is set to an IPv4 address, the experimental NFSv4 client might try and use it as an IPv6 address, breaking callbacks. The fix simply initializes the isinet6 variable for this case.
Approved by: re (kensmith), kib (mentor)
|
195821 |
22-Jul-2009 |
rmacklem |
Add changes to the experimental nfs client to use the PBDRY flag for msleep(9) when a vnode lock or similar may be held. The changes are just a clone of the changes applied to the regular nfs client by r195703.
Approved by: re (kensmith), kib (mentor)
|
195819 |
22-Jul-2009 |
rmacklem |
When using an NFSv4 mount in the experimental nfs client with delegations being issued from the server, there was a case where an Open issued locally based on the delegation would be released before the associated vnode became inactive. If the delegation was recalled after the open was released, an Open against the server would not have been acquired and subsequent I/O operations would need to use the special stateid of all zeros. This patch fixes that case.
Approved by: re (kensmith), kib (mentor)
|
195762 |
19-Jul-2009 |
rmacklem |
Fix two bugs in the experimental nfs client:
- When the root vnode was acquired during mounting, mnt_stat.f_iosize was still set to 0, so getnewvnode() would set bo_bsize == 0. This would confuse getblk(), so that it always returned the first block causing the problem when the root directory of the mount point was greater than one block in size. It was fixed by setting mnt_stat.f_iosize to NFS_DIRBLKSIZ before calling ncl_nget() to acquire the root vnode.
- NFSMNT_INT was being set temporarily while the initial connect to a server was being done. This erroneously configured the krpc for interruptible RPCs, which caused problems because signals weren't being masked off as they would have been for interruptible mounts. This code was deleted to fix the problem. Since mount_nfs does an NFS null RPC before the mount system call, connections to the server should work ok.
Tested by: swell dot k at gmail dot com Approved by: re (kensmith), kib (mentor)
|
195704 |
14-Jul-2009 |
rmacklem |
Fix the experimental nfs client so that it does not cause a "share->excl" panic when doing a lookup of dotdot at the root of a server's file system. The patch avoids calling vn_lock() for that case, since nfscl_nget() has already acquired a lock for the vnode.
Approved by: re (kensmith), kib (mentor)
|
195699 |
14-Jul-2009 |
rwatson |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
195642 |
12-Jul-2009 |
rmacklem |
Add calls to the experimental nfs client for the case of an "intr" mount, so that signals that aren't supposed to terminate RPCs in progress are masked off during the RPC.
Approved by: re (kensmith), kib (mentor)
|
195641 |
12-Jul-2009 |
rmacklem |
Fix the handling of dotdot in lookup for the experimental nfs client in a manner analogous to the change in r195294 for the regular nfs client.
Approved by: re (kensmith), kib (mentor)
|
195510 |
09-Jul-2009 |
rmacklem |
Since the nfscl_getclose() function both decremented open counts and, optionally, created a separate list of NFSv4 opens to be closed, it was possible for the associated OpenOwner to be free'd before the Open was closed. The problem was that the Open was taken off the OpenOwner list before the Close RPC was done and OpenOwners can be free'd once the list is empty. This patch separates out the case of doing the Close RPC into a separate function called nfscl_doclose() and simplifies nfsrpc_doclose() so that it closes a single open instead of a list of them. This avoids removing the Open from the OpenOwner list before doing the Close RPC.
Approved by: re (kensmith), kib (mentor)
|
195423 |
07-Jul-2009 |
kib |
Fix poll(2) and select(2) for named pipes to return "ready for read" when all writers, observed by reader, exited. Use writer generation counter for fifo, and store the snapshot of the fifo generation in the f_seqcount field of struct file, that is otherwise unused for fifos. Set FreeBSD-undocumented POLLINIGNEOF flag only when file f_seqcount is equal to fifo's fi_wgen, and revert r89376.
Fix POLLINIGNEOF for sockets and pipes, and return POLLHUP for them. Note that the patch does not fix not returning POLLHUP for fifos.
PR: kern/94772 Submitted by: bde (original version) Reviewed by: rwatson, jilles Approved by: re (kensmith) MFC after: 6 weeks (might be)
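The generation-counter idea can be modelled in a few lines of userland C. This is a simplified sketch under stated assumptions, not the fifofs implementation: the demo_* types and functions are hypothetical, the field names only echo fi_wgen and f_seqcount, and the choice to bump the generation when a writer opens is an assumption of this model. A reader whose snapshot still equals the current generation has never observed a writer, so end-of-file is ignored for it.

```c
#include <assert.h>

/* Hypothetical fifo state; NOT the kernel's struct fifoinfo. */
struct demo_fifo {
	int writers;	/* writers currently holding the fifo open */
	int wgen;	/* writer generation, bumped when a writer opens (assumed) */
	int bytes;	/* buffered data available to read */
};

/* Hypothetical per-open reader state, standing in for f_seqcount. */
struct demo_reader {
	int seqcount;	/* snapshot of wgen taken when the reader opened */
};

void
demo_reader_open(struct demo_reader *r, const struct demo_fifo *f)
{
	r->seqcount = f->wgen;
}

void
demo_writer_open(struct demo_fifo *f)
{
	f->writers++;
	f->wgen++;
}

void
demo_writer_close(struct demo_fifo *f)
{
	assert(f->writers > 0);
	f->writers--;
}

/* Returns 1 when poll should report "ready for read" for this reader. */
int
demo_poll_readable(const struct demo_reader *r, const struct demo_fifo *f)
{
	/* No writer seen since this reader opened: ignore EOF. */
	int ignore_eof = (r->seqcount == f->wgen);

	if (f->bytes > 0)
		return (1);
	return (f->writers == 0 && !ignore_eof);
}
```

In this model a fresh reader on a writerless fifo is not woken up, while a reader that saw a writer come and go is told that the fifo is readable so it can collect end-of-file.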
|
195294 |
02-Jul-2009 |
kib |
In vn_vget_ino() and its inline equivalents, mnt_ref() the mount point around the sequence that drops the vnode lock and then busies the mount point. Not having a locked vnode or a direct reference to the mp allows a forced unmount to proceed, making mp unmounted or reused.
Tested by: pho Reviewed by: jeff Approved by: re (kensmith) MFC after: 2 weeks
|
194990 |
25-Jun-2009 |
kib |
Change the type of uio_resid member of struct uio from int to ssize_t. Note that this does not actually enable full-range i/o requests for 64-bit architectures, and is done now to update KBI only.
Tested by: pho Reviewed by: jhb, bde (as part of the review of the bigger patch)
|
194951 |
25-Jun-2009 |
rwatson |
Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the in_ifaddrhead and INADDR_HASH address lists.
Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently).
For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists.
MFC after: 6 weeks Reviewed by: bz
|
194766 |
23-Jun-2009 |
kib |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation is performed by kernel, e.g. md(4).
Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced.
In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
|
194601 |
21-Jun-2009 |
kib |
Add explicit struct ucred * argument for VOP_VPTOCNP, to be used by vn_open_cred in default implementation. Valid struct ucred is needed for audit and MAC, and curthread credentials may be wrong.
This further requires modifying the interface of vn_fullpath(9), but it is out of scope of this change.
Reviewed by: rwatson
|
194576 |
21-Jun-2009 |
rdivacky |
In non-debugging mode make this define (void)0 instead of nothing. This helps to catch bugs like the below with clang.
if (cond); <--- note the trailing ;
	something();
Approved by: ed (mentor) Discussed on: current@
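A minimal sketch of the macro pattern this entry describes, assuming a hypothetical DEMO_TRACE macro and DEMO_DEBUG switch (neither appears in the commit). In the non-debugging build the macro expands to ((void)0), so a use such as "if (x) DEMO_TRACE(...);" still has a real statement as its body rather than a bare semicolon, which lets a compiler's empty-body warning stay meaningful for genuine mistakes.

```c
#include <assert.h>

#ifdef DEMO_DEBUG
#include <stdio.h>
/* Debugging build: actually print the message. */
#define DEMO_TRACE(msg)	printf("%s\n", (msg))
#else
/* Non-debugging build: expand to a no-op expression, not to nothing. */
#define DEMO_TRACE(msg)	((void)0)
#endif

/* Clamp negative values to zero, tracing when that happens. */
int
demo_clamp(int v)
{
	if (v < 0)
		DEMO_TRACE("negative input");	/* body is ((void)0); here */
	assert(v >= 0 || -v > 0 || v < 0);
	return (v < 0 ? 0 : v);
}
```

Had the macro expanded to nothing, the if statement above would be indistinguishable from the buggy "if (cond);" form quoted in the entry.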
|
194541 |
20-Jun-2009 |
rmacklem |
Replace RPCAUTH_UNIXGIDS with NFS_MAXGRPS so that nfscbd.c will build.
Approved by: kib (mentor)
|
194532 |
20-Jun-2009 |
ed |
Improve nested jail awareness of devfs by handling credentials.
Now that we start to use credentials on character devices more often (because of MPSAFE TTY), move the prison-checks that are in place in the TTY code into devfs.
Instead of strictly comparing the prisons, use the more common prison_check() function to compare credentials. This means that pseudo-terminals are only visible in devfs by processes within the same jail and parent jails.
Even though regular users in parent jails can now interact with pseudo-terminals from child jails, this seems to be the right approach. These processes are also capable of interacting with the jailed processes anyway, through signals for example.
Reviewed by: kib, rwatson (older version)
|
194523 |
20-Jun-2009 |
rmacklem |
Change the size of the nfsc_groups[] array in the experimental nfs client to RPCAUTH_UNIXGIDS + 1 (17), since that is what can go on the wire for AUTH_SYS authentication.
Reviewed by: brooks Approved by: kib (mentor)
|
194498 |
19-Jun-2009 |
brooks |
Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise them to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.)
The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc.
Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care of the details of allocating the groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for ensuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search.
Because we cannot change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used, or NGRPS as appropriate, for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error.
Minor changes:
- Reduce the number of hand rolled versions of groupmember().
- Do not assign to both cr_gid and cr_groups[0].
- Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity.
Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867
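The "pointer instead of static array, truncate instead of fail" behaviour can be sketched as follows. This is a userland illustration only: struct demo_cred, DEMO_NGROUPS and demo_setgroups() are hypothetical names, and the real crsetgroups() signature and allocation strategy are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NGROUPS	4	/* hypothetical maximum, kept small for the demo */

/* Hypothetical credential: the group list is a pointer, not a fixed array. */
struct demo_cred {
	int	 ngroups;
	unsigned *groups;
};

/*
 * Replace the group list of cr with a copy of the first n entries of
 * groups, silently truncating to DEMO_NGROUPS.  Returns 0 on success,
 * -1 on bad input or allocation failure.
 */
int
demo_setgroups(struct demo_cred *cr, int n, const unsigned *groups)
{
	unsigned *copy;

	assert(cr != NULL);
	if (n < 0 || (n > 0 && groups == NULL))
		return (-1);
	if (n > DEMO_NGROUPS)
		n = DEMO_NGROUPS;	/* truncate rather than error out */
	copy = malloc((size_t)(n > 0 ? n : 1) * sizeof(*copy));
	if (copy == NULL)
		return (-1);
	if (n > 0)
		memcpy(copy, groups, (size_t)n * sizeof(*copy));
	free(cr->groups);
	cr->groups = copy;
	cr->ngroups = n;
	return (0);
}
```

Because callers only ever see the pointer and the count, the maximum can be raised later without changing the layout of the structure, which is the ABI point the entry makes.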
|
194425 |
18-Jun-2009 |
alc |
Fix some of the style errors in *getpages().
|
194408 |
17-Jun-2009 |
rmacklem |
Add the SVC_RELEASE(xprt), as required by r194407.
Approved by: kib (mentor)
|
194368 |
17-Jun-2009 |
bz |
Add explicit includes for jail.h to the files that need them and remove the "hidden" one from vimage.h.
|
194363 |
17-Jun-2009 |
rmacklem |
Fix handling of ".." in nfs_lookup() for the forced dismount case by cribbing the change made to the regular nfs client in r194358.
Approved by: kib (mentor)
|
194357 |
17-Jun-2009 |
bz |
Add the explicit include of vimage.h to another five .c files still missing it.
Remove the "hidden" kernel only include of vimage.h from ip_var.h added with the very first Vimage commit r181803 to avoid further kernel poisoning.
|
194292 |
16-Jun-2009 |
rmacklem |
Remove the "int *" typecast for the aresid argument to vn_rdwr() and change the type of the argument from size_t to int. This should avoid issues on 64bit architectures.
Suggested by: kib Approved by: kib (mentor)
|
194124 |
13-Jun-2009 |
alc |
Eliminate unnecessary variables.
|
194118 |
13-Jun-2009 |
jamie |
Rename the host-related prison fields to be the same as the host.* parameters they represent, and the variables they replaced, instead of abbreviated versions of them.
Approved by: bz (mentor)
|
194117 |
13-Jun-2009 |
jamie |
Use getcredhostuuid instead of accessing the prison directly.
Approved by: bz (mentor)
|
194078 |
12-Jun-2009 |
jhb |
Update the inline version of vn_get_ino() for ".." lookups to match the recentish changes to vn_get_ino().
MFC after: 1 week
|
193955 |
10-Jun-2009 |
rmacklem |
This commit is analogous to r193952, but for the experimental nfs subsystem. Add a test for VI_DOOMED just after ncl_upgrade_vnlock() in ncl_bioread_check_cons(). This is required since it is possible for the vnode to be vgonel()'d while in ncl_upgrade_vnlock() when a forced dismount is in progress. Also, move the check for VI_DOOMED in ncl_vinvalbuf() down to after ncl_upgrade_vnlock() and replace the out of date comment for it.
Approved by: kib (mentor)
|
193930 |
10-Jun-2009 |
kib |
For cd9660_ioctl, check for recycled vnode after locking it.
Noted by: Jaakko Heinonen <jh saunalahti fi> MFC after: 2 weeks
|
193924 |
10-Jun-2009 |
kib |
Fix r193923 by noting that type of a_fp is struct file *, not int. It was assumed that r193923 was trivial change that cannot be done wrong.
MFC after: 2 weeks
|
193923 |
10-Jun-2009 |
kib |
s/a_fdidx/a_fp/ for VOP_OPEN comments that inline struct vop_open_args definition.
Discussed with: bde MFC after: 2 weeks
|
193922 |
10-Jun-2009 |
kib |
Remove unused VOP_IOCTL and VOP_KQFILTER implementations for fifofs.
MFC after: 2 weeks
|
193919 |
10-Jun-2009 |
kib |
VOP_IOCTL takes unlocked vnode as an argument. Due to this, v_data may be NULL or dereferenced memory may become free at an arbitrary moment.
Lock the vnode in cd9660, devfs and pseudofs implementation of VOP_IOCTL to prevent reclaim; check whether the vnode was already reclaimed after the lock is granted.
Reported by: georg at dts su Reviewed by: des (pseudofs) MFC after: 2 weeks
|
193837 |
09-Jun-2009 |
rmacklem |
Since vn_lock() with the LK_RETRY flag never returns an error for FreeBSD-CURRENT, the code that checked for and returned the error was broken. Change it to check for VI_DOOMED set after vn_lock() and return an error for that case. I believe this should only happen for forced dismounts.
Approved by: kib (mentor)
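The control-flow change can be illustrated with a small userland model. Everything here is a stand-in: struct demo_vnode, DEMO_VI_DOOMED, DEMO_EDOOMED and the demo_* functions are hypothetical and only mimic the shape of the kernel interfaces. The lock helper always returns 0, so testing its return value would be dead code; the caller instead inspects the doomed flag once it holds the lock.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_VI_DOOMED	0x1	/* hypothetical "vnode was reclaimed" flag */
#define DEMO_EDOOMED	1	/* hypothetical error code for that case */

/* Hypothetical vnode model. */
struct demo_vnode {
	int flags;
	int locked;
};

/* Models a retrying lock: it never reports failure. */
static int
demo_vn_lock_retry(struct demo_vnode *vp)
{
	vp->locked = 1;
	return (0);
}

/* Lock the vnode, bail out if it was doomed, otherwise do the work. */
int
demo_do_op(struct demo_vnode *vp)
{
	assert(vp != NULL);
	(void)demo_vn_lock_retry(vp);	/* return value carries no information */
	if (vp->flags & DEMO_VI_DOOMED) {
		vp->locked = 0;
		return (DEMO_EDOOMED);
	}
	/* ... the real operation would run here, under the lock ... */
	vp->locked = 0;
	return (0);
}
```

The error path unlocks before returning, so a doomed vnode is never left locked by the caller.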
|
193735 |
08-Jun-2009 |
rmacklem |
Fix nfscl_getcl() so that it doesn't crash when it is called to do an NFSv4 Close operation with the cred argument NULL. Also, clarify what NULL arguments mean in the function's comment.
Approved by: kib (mentor)
|
193571 |
06-Jun-2009 |
rwatson |
Use #ifdef APPLE_MAC instead of #ifdef MAC to conditionalize Apple-specific behavior for unicode support in UDF so as not to conflict with the MAC Framework.
Note that Apple's XNU kernel also uses #ifdef MAC for the MAC Framework.
Suggested by: pjd MFC after: 3 days
|
193556 |
06-Jun-2009 |
des |
Drop Giant.
MFC after: 1 week
|
193511 |
05-Jun-2009 |
rwatson |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
|
193507 |
05-Jun-2009 |
rwatson |
Don't check MAC in the NFS server ACL set path, right now we aren't enforcing MAC for NFS clients.
|
193433 |
04-Jun-2009 |
rwatson |
Re-add opt_mac.h include, which is required in order for MNT_MULTILABEL to be set properly on devfs. Otherwise, it isn't possible to set labels on /dev nodes.
Reported by: Sergio Rodriguez <sergiorr at yahoo.com> MFC after: 3 days
|
193187 |
31-May-2009 |
alc |
nfs_write() can use the recently introduced vfs_bio_set_valid() instead of vfs_bio_set_validclean(), thereby avoiding the page queues lock.
Garbage collect vfs_bio_set_validclean(). Nothing uses it any longer.
|
193176 |
31-May-2009 |
kib |
Unlock the pseudofs vnode before calling fill method for pfs_readlink(). The fill code may need to lock another vnode, e.g. procfs file implementation.
Reviewed by: des Tested by: pho MFC after: 2 weeks
|
193175 |
31-May-2009 |
kib |
Implement the bypass routine for VOP_VPTOCNP in nullfs. Among other things, this makes procfs <pid>/file working for executables started from nullfs mount.
Tested by: pho PR: 94269, 104938
|
193173 |
31-May-2009 |
kib |
Do not drop vnode interlock in null_checkvp(). null_lock() verifies that v_data is not-null before calling NULLVPTOLOWERVP(), and dropping the interlock allows for reclaim to clean v_data and free the memory.
While there, remove unneeded semicolons and convert the infinite loops to panics. I have a will to remove null_checkvp() altogether, or leave it as a trivial stub, but not now.
Reported and tested by: pho
|
193172 |
31-May-2009 |
kib |
Lock the real null vnode lock before substitution of vp->v_vnlock. This should not really matter for correctness, since vp->v_lock is not locked before the call, and null_lock() holds the interlock, but makes the control flow for reclaim more clear.
Tested by: pho
|
193162 |
31-May-2009 |
zec |
Unbreak options VIMAGE kernel builds.
Approved by: julian (mentor)
|
193125 |
30-May-2009 |
rmacklem |
Add a check to v_type == VREG for the recently modified code that does NFSv4 Closes in the experimental client's VOP_INACTIVE(). I also replaced a bunch of ap->a_vp with a local copy of vp, because I thought that made it more readable.
Approved by: kib (mentor)
|
193092 |
30-May-2009 |
trasz |
Add VOP_ACCESSX, which can be used to query for newly added V* permissions, such as VWRITE_ACL. For filesystems that don't implement it, there is a default implementation, which works as a wrapper around VOP_ACCESS.
Reviewed by: rwatson@
|
193066 |
29-May-2009 |
jamie |
Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible.
The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed.
Approved by: bz (mentor)
|
192986 |
28-May-2009 |
alc |
Make *getpages()s' assertion on the state of each page's dirty bits stricter.
|
192973 |
28-May-2009 |
des |
Use a temporary variable to avoid a duplicate strlen().
Submitted by: kib MFC after: 1 week
|
192928 |
27-May-2009 |
rmacklem |
Fix handling of NFSv4 Close operations in ncl_inactive(). Only do them for NFSv4 and flush writes to the server before doing the Close(s), as required. Also, use the a_td argument instead of curthread.
Approved by: kib (mentor)
|
192917 |
27-May-2009 |
alc |
Eliminate redundant setting of a page's valid bits and pointless clearing of the same page's dirty bits.
|
192898 |
27-May-2009 |
rmacklem |
Add a function to the experimental nfs subsystem that tests to see if a local file system supports NFSv4 ACLs. This allows the NFSHASNFS4ACL() macro to be correctly implemented. The NFSv4 ACL support should now work when the server exports a ZFS volume.
Approved by: kib (mentor)
|
192895 |
27-May-2009 |
jamie |
Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings.
Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge().
Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call.
Approved by: bz (mentor)
|
192861 |
26-May-2009 |
rmacklem |
Fix the experimental nfs subsystem so that it builds with the current NFSv4 ACLs, as defined in sys/acl.h. It still needs a way to test a mount point for NFSv4 ACL support before it will work. Until then, the NFSHASNFS4ACL() macro just always returns 0.
Approved by: kib (mentor)
|
192818 |
26-May-2009 |
trasz |
Adapt to the new ACL #define names.
Reviewed by: rmacklem@
|
192782 |
26-May-2009 |
rmacklem |
Add two sysctl variables to the experimental nfs server, so that the range of versions of NFS handled by the server can be limited. The nfsd daemon must be restarted after these sysctl variables are changed, in order for the change to take effect.
Approved by: kib (mentor)
|
192781 |
26-May-2009 |
rmacklem |
Fix the handling of NFSv4 Illegal Operation number to conform to RFC3530 (the operation number in the reply must be set to the value for OP_ILLEGAL). Also cleaned up some indentation.
Approved by: kib (mentor)
|
192780 |
26-May-2009 |
rmacklem |
Fix the experimental nfs server's interface to the new krpc so that it handles the case of a non-exported NFSv4 root correctly. Also, delete handling for the case where nd_repstat is already set in nfs_proc(), since that no longer happens.
Approved by: kib (mentor)
|
192707 |
25-May-2009 |
rmacklem |
Add NFSv4 root export checks to the DelegPurge, Renew and ReleaseLockOwner operations analogous to what is already in place for SetClientID and SetClientIDConfirm. These are the five NFSv4 operations that do not use file handle(s), so the checks are done using the NFSv4 root export entries in /etc/exports.
Approved by: kib (mentor)
|
192705 |
25-May-2009 |
rmacklem |
Temporarily #undef NFS4_ACL_EXTATTR_NAME, so that the experimental nfs subsystem will build while the NFSv4 ACL support is going into the kernel.
Approved by: kib (mentor)
|
192695 |
24-May-2009 |
rmacklem |
Crib the realign function out of nfs_krpc.c and add a call to it for the client side reply. Hopefully this fixes the problem with using the new krpc for arm for the experimental nfs client.
Approved by: kib (mentor)
|
192693 |
24-May-2009 |
rmacklem |
Fix the experimental NFSv4 server so that it handles the case where a client is not allowed NFSv4 access correctly. This restriction is specified in the "V4: ..." line(s) in /etc/exports.
Approved by: kib (mentor)
|
192675 |
24-May-2009 |
rmacklem |
Fix the experimental nfsv4 client so that it works for the case of a kerberized mount without a host based principal name. This will only work for mounts being done by a user other than root. Support for a host based principal name will not work until proposed changes to the rpcsec_gss part of the krpc are committed. It now builds for "options KGSSAPI".
Approved by: kib (mentor)
|
192657 |
23-May-2009 |
alc |
Eliminate the unnecessary clearing of a page's dirty bits from nwfs_getpages().
|
192616 |
23-May-2009 |
rmacklem |
Fix the rpc_gss_secfind() call in nfs_commonkrpc.c so that the code will build when "options KGSSAPI" is specified without requiring the proposed changes that add host based initiator principal support. It will not handle the case where the client uses a host based initiator principal until those changes are committed. The code that uses those changes is #ifdef'd notyet until the krpc rpcsec_changes are committed.
Approved by: kib (mentor)
|
192613 |
22-May-2009 |
rmacklem |
Change the sysctl_base argument to svcpool_create() to NULL for client side callbacks so that leaf names are not re-used, since they are already being used by the server.
Approved by: kib (mentor)
|
192601 |
22-May-2009 |
rmacklem |
Fix the name of the module common to the client and server in the experimental nfs subsystem to the correct one for the MODULE_DEPEND() macro.
Approved by: kib (mentor)
|
192596 |
22-May-2009 |
rmacklem |
Change the printf of r192595 to identify the function, as requested by Sam.
Approved by: kib (mentor)
|
192591 |
22-May-2009 |
rmacklem |
Modified the printf message of r192590 to remove the possible DOS attack, as suggested by Sam.
Approved by: kib (mentor)
|
192589 |
22-May-2009 |
rmacklem |
Change the comment at the beginning of the function to reflect the change from panic() to printf() done by r192588.
|
192588 |
22-May-2009 |
rmacklem |
Change the reboot panic that would have occurred if clientid numbers wrapped around to a printf() warning of a possible DOS attack, in the experimental nfsv4 server.
Approved by: kib (mentor)
|
192585 |
22-May-2009 |
rmacklem |
Modify the mount handling code in the experimental nfs client to use the newer nmount() style arguments, as is used by mount_nfs.c. This prepares the kernel code for the use of a mount_nfs.c with changes for the experimental client integrated into it.
Approved by: kib (mentor)
|
192582 |
22-May-2009 |
rmacklem |
Change the code in the experimental nfs client to avoid flushing writes upon close when a write delegation is held by the client. This should be safe to do, now that nfsv4 Close operations are delayed until ncl_inactive() is called for the vnode.
Approved by: kib (mentor)
|
192581 |
22-May-2009 |
rmacklem |
Fix the comment in sys/fs/nfs/nfs.h to correctly reflect the current use of the R_xxx flags. This changed when the NFS_LEGACYRPC code was removed from the subsystem.
Approved by: kib (mentor)
|
192578 |
22-May-2009 |
rwatson |
Remove the unmaintained University of Michigan NFSv4 client from 8.x prior to 8.0-RELEASE. Rick Macklem's new and more feature-rich NFSv234 client and server are replacing it.
Discussed with: rmacklem
|
192574 |
22-May-2009 |
rmacklem |
Fix the experimental nfs server so that it depends on the nlm, since it now calls nlm_acquire_next_sysid().
Approved by: kib (mentor)
|
192539 |
21-May-2009 |
rmacklem |
Fix the comment at line 3711 to be consistent with the change applied for r192537.
Approved by: kib (mentor)
|
192503 |
21-May-2009 |
rmacklem |
Modify sys/fs/nfsserver/nfs_nfsdport.c to use nlm_acquire_next_sysid() to set the l_sysid for locks correctly.
Approved by: kib (mentor)
|
192463 |
20-May-2009 |
rmacklem |
Although it should never happen, all the nfsv4 server can do when it runs out of clientids is reboot. I had replaced cpu_reboot() with printf(), since cpu_reboot() doesn't exist for sparc64. This change replaces the printf() with panic(), so the reboot would occur for this highly unlikely occurrence.
Approved by: kib (mentor)
|
192337 |
18-May-2009 |
rmacklem |
Change the experimental NFSv4 client so that it does not do the NFSv4 Close operations until ncl_inactive(). This is necessary so that the Open StateIDs are available for doing I/O on mmap'd files after VOP_CLOSE(). I also changed some indentation for the nfscl_getclose() function.
Approved by: kib (mentor)
|
192256 |
17-May-2009 |
rmacklem |
Fix the acquisition of local locks via VOP_ADVLOCK() by the experimental nfsv4 server. It was setting the a_id argument to a fixed value, but that wasn't sufficient for FreeBSD8. Instead, set l_pid and l_sysid to 0 plus set the F_REMOTE flag to indicate that these fields are used to check for same lock owner. Since, for NFSv4, a lockowner is a ClientID plus an up to 1024byte name, it can't be put in l_sysid easily. I also renamed the p variable to td, since it's a thread ptr.
Approved by: kib (mentor)
|
192255 |
17-May-2009 |
rmacklem |
Added a SYSCTL to sys/fs/nfsserver/nfs_nfsdport.c so that the value of nfsrv_dolocallocks can be changed via sysctl. I also added some non-empty descriptor strings and reformatted some overly long lines.
Approved by: kib (mentor)
|
192245 |
17-May-2009 |
alc |
Merge r191964: Eliminate a case of unnecessary page queues locking.
|
192231 |
16-May-2009 |
rmacklem |
Changed sys/fs/nfs_clbio.c in the same way Alan Cox changed sys/nfsclient/nfs_bio.c for r192134, so that the sources stay in sync.
Approved by: kib (mentor)
|
192181 |
16-May-2009 |
rmacklem |
Fixed the Null callback RPCs so that they work with the new krpc. This required two changes: setting the program and version numbers before connect and fixing the handling of the Null Rpc case in newnfs_request().
Approved by: kib (mentor)
|
192152 |
15-May-2009 |
rmacklem |
Move the nfsstat structure and proc/op number definitions on the experimental nfs subsystem from sys/fs/nfs/nfs.h and sys/fs/nfs/nfsproto.h to sys/fs/nfs/nfsport.h and rename nfsstat to ext_nfsstat. This was done so that src/usr.bin/nfsstat.c could use it alongside the regular nfs include files and struct nfsstat.
Approved by: kib (mentor)
|
192151 |
15-May-2009 |
kib |
Devfs replaces file ops vector with devfs-specific one in devfs_open(), before the struct file is fully initialized in vn_open(), in particular, fp->f_vnode is NULL. Other thread calling file operation before f_vnode is set results in NULL pointer dereference in devvn_refthread().
Initialize f_vnode before calling d_fdopen() cdevsw method, that might set file ops too.
Reported and tested by: Chris Timmons <cwt networks cwu edu> (RELENG_7 version) MFC after: 3 days
|
192145 |
15-May-2009 |
rmacklem |
Modify the diskless booting code in sys/fs/nfsclient to be compatible with what is in sys/nfsclient, so that it will at least build now.
Approved by: kib (mentor)
|
192134 |
15-May-2009 |
alc |
Eliminate unnecessary clearing of the page's dirty mask from various getpages functions.
Eliminate a stale comment.
|
192121 |
14-May-2009 |
rmacklem |
Apply changes to the experimental nfs server so that it uses the security flavors as exported in FreeBSD-CURRENT. This allows it to use a slightly modified mountd.c instead of a different utility.
Approved by: kib (mentor)
|
192115 |
14-May-2009 |
rmacklem |
Change the file names in the comments in sys/fs/nfs/nfs_var.h so that they are the names used in FreeBSD-CURRENT. Also shuffled a few entries around, so that they are under the correct comment.
Approved by: kib (mentor)
|
192065 |
13-May-2009 |
rmacklem |
Apply a one line change to nfs_clbio.c (which is largely a copy of sys/nfsclient/nfs_bio.c) to track the change recently committed by alc for nfs_bio.c.
Approved by: kib (mentor)
|
192017 |
12-May-2009 |
rmacklem |
Modify the experimental nfs server to use the new nfsd_nfsd_args structure for nfsd. Includes a change that clarifies the use of an empty principal name string to indicate AUTH_SYS only.
Approved by: kib (mentor)
|
192013 |
12-May-2009 |
kib |
Report all fdescfs vnodes as VCHR for stat(2). Fake the unique major/minor numbers of the devices.
Pretending that the vnodes are character devices prevents file tree walkers from descending into the directories opened by current process. Also, not doing stat on the filedescriptors prevents the recursive entry into the VFS.
Requested by: kientzle Discussed with: Jilles Tjoelker <jilles stack nl>
|
192012 |
12-May-2009 |
kib |
Return controlled EINVAL when the fdescfs lookup routine is given string representing too large integer, instead of overflowing and possibly returning a random but valid vnode.
Noted by: Jilles Tjoelker <jilles stack nl> MFC after: 3 days
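A minimal sketch of the overflow check described above, assuming a hypothetical helper parse_fd_name(); it is not the fdescfs lookup code itself, only one way to reject a too-large decimal string before the value can wrap:

```c
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical helper (name and signature are assumptions): parse a
 * decimal component name into a non-negative int.  Returns 0 on success
 * and EINVAL for an empty string, a non-digit, or a value above INT_MAX.
 */
int
parse_fd_name(const char *name, int *fdp)
{
	int digit, fd = 0;

	if (*name == '\0')
		return (EINVAL);
	for (; *name != '\0'; name++) {
		if (*name < '0' || *name > '9')
			return (EINVAL);
		digit = *name - '0';
		/* Reject before fd * 10 + digit could exceed INT_MAX. */
		if (fd > (INT_MAX - digit) / 10)
			return (EINVAL);
		fd = fd * 10 + digit;
	}
	*fdp = fd;
	return (0);
}
```

With this shape, a name such as "99999999999999999999" yields EINVAL rather than some small, valid-looking descriptor number.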
|
192010 |
12-May-2009 |
alc |
Eliminate gratuitous clearing of the page's dirty mask.
|
192000 |
11-May-2009 |
rmacklem |
Change the name of the nfs server addsock structure from nfsd_args to nfsd_addsock_args, so that it is consistent with the one in sys/nfsserver/nfs.h.
Approved by: kib (mentor)
|
191998 |
11-May-2009 |
rmacklem |
Modify nfsvno_fhtovp() to ensure that it always sets the credp argument. Returning without credp set could result in a caller doing crfree() on garbage.
Reviewed by: kan Approved by: kib (mentor)
|
191990 |
11-May-2009 |
attilio |
Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and related parts no longer take the context, since it always refers to curthread.
In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP.
While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option.
The VFS KPI is heavily changed by this commit, so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal this situation.
|
191964 |
10-May-2009 |
alc |
Eliminate stale comments.
Eliminate a case of unnecessary page queues locking.
|
191940 |
09-May-2009 |
kan |
Do not embed struct ucred into larger netcred parent structures.
Credential might need to hang around longer than its parent and be used outside of mnt_explock scope controlling netcred lifetime. Use separate reference-counted ucred allocated separately instead.
While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up some unused declarations in new NFS code.
Reported by: John Hickey PR: kern/133439 Reviewed by: dfr, kib
|
191783 |
04-May-2009 |
rmacklem |
Add the experimental nfs subtree to the kernel, which includes support for NFSv4 as well as NFSv2 and 3. It lives in 3 subdirs under sys/fs:
nfs - functions that are common to the client and server
nfsclient - a mutation of sys/nfsclient that calls generic functions to do RPCs and handle state. As such, it retains the buffer cache handling characteristics and vnode semantics that are found in sys/nfsclient, for the most part.
nfsserver - the server. It includes a DRC designed specifically for NFSv4, that is used instead of the generic DRC in sys/rpc.
The build glue will be checked in later, so at this point, it consists of 3 new subdirs that should not affect kernel building.
Approved by: kib (mentor)
|
190888 |
10-Apr-2009 |
rwatson |
Remove VOP_LEASE and supporting functions. This hasn't been used since the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces.
Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd.
Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon
|
190839 |
08-Apr-2009 |
des |
Remove spurious locking in pfs_write().
Reported by: Andrew Brampton <me@bramp.net> MFC after: 1 week
|
190806 |
07-Apr-2009 |
des |
Fix an inverted KASSERT. Add similar assertions in other similar places.
Reported by: Andrew Brampton <me@bramp.net> MFC after: 1 week
|
189961 |
18-Mar-2009 |
pho |
Do not use null_bypass for VOP_ISLOCKED, directly call default implementation. null_bypass cannot work for the !nullfs-vnodes, in particular, for VBAD vnodes.
In collaboration with: kib
|
189758 |
13-Mar-2009 |
attilio |
Remove the null_islocked() overloaded vop because the standard one does the same.
|
189696 |
11-Mar-2009 |
jhb |
Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a filesystem supports additional operations using shared vnode locks. Currently this is used to enable shared locks for open() and close() of read-only file descriptors.
- When an ISOPEN namei() request is performed with LOCKSHARED, use a shared vnode lock for the leaf vnode only if the mount point has the extended shared flag set.
- Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but not O_CREAT.
- Use a shared vnode lock around VOP_CLOSE() if the file was opened with O_RDONLY and the mountpoint has the extended shared flag set.
- Adjust md(4) to upgrade the vnode lock on the vnode it gets back from vn_open() since it now may only have a shared vnode lock.
- Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since FIFO's require exclusive vnode locks for their open() and close() routines. (My recent MPSAFE patches for UDF and cd9660 already included this change.)
- Enable extended shared operations on UFS, cd9660, and UDF.
Submitted by: ups Reviewed by: pjd (ZFS bits) MFC after: 1 month
|
189693 |
11-Mar-2009 |
kib |
Enable advisory file locking for devfs vnodes.
Reported by: Timothy Redaelli <timothy redaelli eu> MFC after: 1 week
|
189622 |
10-Mar-2009 |
kib |
Do not use bypass for vop_vptocnp() from nullfs, call standard implementation instead. The bypass does not assume that returned vnode is only held.
Reported by: Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com> Reviewed by: jhb Tested by: pho, pluknet <pluknet gmail com>
|
189450 |
06-Mar-2009 |
kib |
Extract the no_poll() and vop_nopoll() code into the common routine poll_no_poll(). Return a poll_no_poll() result from devfs_poll_f() when filedescriptor does not reference the live cdev, instead of ENXIO.
Noted and tested by: hps MFC after: 1 week
|
189364 |
04-Mar-2009 |
avg |
udf: use truly unique directory cookie
'off' is an offset within current block, so there is a good chance it can be non-unique, so use complete offset.
Submitted by: bde Approved by: jhb
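A small sketch of why the complete offset makes a better cookie than the offset within the current block. The function name and parameters are hypothetical and are not taken from the udf sources:

```c
#include <stdint.h>

/*
 * Hypothetical illustration: build a directory cookie from the block
 * number and the offset inside that block.  Using off_in_block alone
 * would give the same cookie to entries at the same position in
 * different blocks.
 */
uint64_t
dir_cookie(uint64_t blkno, uint32_t bsize, uint32_t off_in_block)
{
	return (blkno * bsize + off_in_block);
}
```

Two entries at offset 64 in blocks 0 and 1 of a 2048-byte-block directory then get cookies 64 and 2112 instead of both getting 64.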
|
189363 |
04-Mar-2009 |
avg |
udf_strategy: remove redundant comment
We fail mapping for any udf_bmap_internal error and there can be different reasons for it, so no need to (over-)emphasize files with data in fentry.
Submitted by: bde Approved by: jhb
|
189302 |
03-Mar-2009 |
avg |
udf_readdir: do not advance offset if entry can not be uio-ed
Previously readdir missed some directory entries because there was no space for them in current uio but directory stream offset was advanced nevertheless. jhb has discovered the issue and provided a test-case.
Reviewed by: bde Approved by: jhb (mentor)
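The rule this fix enforces can be sketched with a toy readdir loop. Everything here (struct ex_dirent, ex_readdir(), fixed-size entries) is a simplifying assumption and not the udf_readdir code; the point is only that the stream offset advances after a successful copy, never before:

```c
#include <stddef.h>
#include <string.h>

/* Toy fixed-size directory entry, for illustration only. */
struct ex_dirent {
	char	name[16];
};

/*
 * Copy entries starting at index *offp into buf, which has resid bytes
 * of room.  *offp moves past an entry only once that entry has been
 * copied, so an entry that does not fit is returned by the next call.
 * Returns the number of entries copied.
 */
size_t
ex_readdir(const struct ex_dirent *dir, size_t nent, size_t *offp,
    char *buf, size_t resid)
{
	size_t copied = 0;

	while (*offp < nent) {
		if (resid < sizeof(dir[0]))
			break;	/* no room: leave *offp on this entry */
		memcpy(buf, &dir[*offp], sizeof(dir[0]));
		buf += sizeof(dir[0]);
		resid -= sizeof(dir[0]);
		(*offp)++;
		copied++;
	}
	return (copied);
}
```

With three entries and a 40-byte buffer, the first call returns two entries and leaves the offset at 2; the second call picks up the third entry instead of skipping it.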
|
189282 |
02-Mar-2009 |
kib |
Use the p_sysent->sv_flags flag SV_ILP32 to detect 32bit process executing on 64bit kernel. This eliminates the direct comparisons of p_sysent with &ia32_freebsd_sysvec, that were left intact after r185169.
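A rough sketch of the flag test, using stand-in types. The ex_ structures and the flag value below are assumptions made for the example; the real definitions live in the kernel headers and may differ:

```c
/* Stand-in flag value, assumed for the example only. */
#define	EX_SV_ILP32	0x000100

/* Stand-ins for the sysentvec and proc structures. */
struct ex_sysentvec {
	int	sv_flags;
};

struct ex_proc {
	struct ex_sysentvec	*p_sysent;
};

/*
 * Nonzero if the process uses a 32-bit ABI.  Testing a flag works for
 * any 32-bit sysentvec, whereas comparing p_sysent against one specific
 * vector only recognizes that vector.
 */
int
ex_proc_is_ilp32(const struct ex_proc *p)
{
	return ((p->p_sysent->sv_flags & EX_SV_ILP32) != 0);
}
```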
|
189120 |
27-Feb-2009 |
jhb |
- Hold a reference on the cdev a filesystem is mounted from in the mount. - Remove the cdev pointers from the denode and instead use the mountpoint's reference to call dev2udev() in getattr().
Reviewed by: kib, julian
|
189111 |
27-Feb-2009 |
avg |
udf_readatoffset: return correct size and data pointer for data in fentry
This should help correct reading of directories with data located in fentry.
Submitted by: bde Approved by: jhb (mentor)
|
189082 |
26-Feb-2009 |
avg |
udf_readatoffset: read through directory vnode, do not read > MAXBSIZE
Currently bread()-ing through device vnode with (1) VMIO enabled, (2) bo_bsize != DEV_BSIZE (3) more than 1 block results in data being incorrectly cached. So instead a more common approach of using a vnode belonging to fs is now employed. Also, prevent attempt to bread more than MAXBSIZE bytes because of adjustments made to account for offset that doesn't start on block boundary. Add expanded comments to explain the calculations. Also drop unused inline function while here.
PR: kern/120967 PR: kern/129084
Reviewed by: scottl, kib Approved by: jhb (mentor)
|
189070 |
26-Feb-2009 |
avg |
udf: add read-ahead support modeled after cd9660
Reviewed by: scottl Approved by: jhb (mentor)
|
189069 |
26-Feb-2009 |
avg |
udf_map: return proper error code instead of leaking an internal one
Incidentally this also allows for small files with data embedded into fentry to be mmap-ed.
Approved by: jhb (mentor)
|
189068 |
26-Feb-2009 |
avg |
udf_read: correctly read data from files with data embedded into fentry,
... as opposed to files with data in extents. Some UDF authoring tools produce this type of file for sufficiently small data files.
Approved by: jhb (mentor)
|
189067 |
26-Feb-2009 |
avg |
udf_strategy: tiny optimization of logic, calculations; extra diagnostics
Use bit-shift instead of division/multiplication. Act on error as soon as it is detected. Report attempt to read data embedded in file entry via regular way. While there, fix lblktosize macro and make use of it.
No functionality should change as a result.
Approved by: jhb (mentor)
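For illustration, block/byte conversions done with shifts rather than multiplication and division might look like the sketch below. The helper names and the 2048-byte block size are assumptions; this is not the macro from the udf headers:

```c
#include <stdint.h>

/* Assumed for the example: 2048-byte logical blocks, so the shift is 11. */
#define	EX_BSHIFT	11

/* Logical block number to byte offset: blk * 2^bshift. */
uint64_t
ex_lblktosize(unsigned int bshift, uint64_t blk)
{
	return (blk << bshift);
}

/* Byte offset to logical block number: size / 2^bshift, rounded down. */
uint64_t
ex_sizetolblk(unsigned int bshift, uint64_t size)
{
	return (size >> bshift);
}
```

The two forms are equivalent to multiplying or dividing by the block size only because the block size is a power of two.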
|
188956 |
23-Feb-2009 |
trasz |
Right now, when trying to unmount a device that's already gone, msdosfs_unmount() and ffs_unmount() exit early after getting ENXIO. However, dounmount() treats ENXIO as a success and proceeds with unmounting. In effect, the filesystem gets unmounted without closing GEOM provider etc.
Reviewed by: kib Approved by: rwatson (mentor) Tested by: dho Sponsored by: FreeBSD Foundation
|
188929 |
22-Feb-2009 |
alc |
Use uiomove_fromphys() instead of the combination of sf_buf and uiomove().
This is not only shorter; it also eliminates unnecessary thread pinning on architectures that implement a direct map.
MFC after: 3 weeks
|
188921 |
22-Feb-2009 |
alc |
Simplify the unwiring and activation of pages.
MFC after: 1 week
|
188816 |
19-Feb-2009 |
avg |
style nit in r188815
Pointed out by: jhb, rpaulo Approved by: jhb (mentor)
|
188815 |
19-Feb-2009 |
avg |
fs/udf: fix incorrect error return (-1) when reading a large dir
Not enough space in user-land buffer is not an error, userland will read further until eof is reached. So instead of propagating -1 to caller we convert it to zero/success.
cd9660 code works exactly the same way.
PR: kern/78987 Reviewed by: jhb (mentor) Approved by: jhb (mentor)
|
188677 |
16-Feb-2009 |
des |
Fix a logic bug that caused the pfs_attr method to be called only for PFS_PROCDEP nodes.
Submitted by: Andrew Brampton <brampton@gmail.com> MFC after: 2 weeks
|
188588 |
13-Feb-2009 |
jhb |
Use shared vnode locks when invoking VOP_READDIR().
MFC after: 1 month
|
188502 |
11-Feb-2009 |
jhb |
- Consolidate error handling in the cd9660 and udf mount routines. - Always read the character device pointer while the associated devfs vnode is locked. Also, use dev_ref() to obtain a new reference on the vnode for the mountpoint. This reference is released on unmount. This mirrors the earlier fix to FFS.
Reviewed by: kib
|
188407 |
09-Feb-2009 |
jhb |
Mark udf(4) MPSAFE and add support for shared vnode locks during pathname lookups:
- Honor the caller's locking flags in udf_root() and udf_vget().
- Set VV_ROOT for the root vnode in udf_vget() instead of only doing it in udf_root().
- Honor the requested locking flags during pathname lookups in udf_lookup().
- Release the buffer holding the directory data before looking up the vnode for a given file to avoid a LOR between the "udf" vnode locks and "bufwait".
- Use vn_vget_ino() to handle ".." lookups.
- Special case "." lookups instead of calling udf_vget(). We have to do extra checking for the vnode lock for "." lookups.
|
188406 |
09-Feb-2009 |
jhb |
Use the same style as the rest of the file for the optional data string after each path component rather than a GCC-ism.
|
188318 |
08-Feb-2009 |
kib |
Look up the directory entry for the tmpfs node being deleted by both node pointer and name component. This does the right thing for hardlinks to the same node in the same directory.
Submitted by: Yoshihiro Ota <ota j email ne jp> PR: kern/131356 MFC after: 2 weeks
|
188251 |
06-Feb-2009 |
jhb |
Add rudimentary support for symbolic links on UDF. Links are stored as a sequence of pathname components. We walk the list building a string in the caller's passed in buffer. Currently this only handles path names in CS8 (character set 8) as that is what mkisofs generates for UDF images.
MFC after: 1 month
|
188245 |
06-Feb-2009 |
jhb |
Add support for fifos to UDF:
- Add a separate set of vnode operations that inherits from the fifo ops and use it for fifo nodes.
- Add a VOP_SETATTR() method that allows setting the size (by silently ignoring the requests) of fifos. This is to allow O_TRUNC opens of fifo devices (e.g. I/O redirection in shells using ">").
- Add a VOP_PRINT() handler while I'm here.
|
188244 |
06-Feb-2009 |
jhb |
Tweak the output of VOP_PRINT/vn_printf() some.
- Align the fifo output in fifo_print() with other vn_printf() output.
- Remove the leading space from lockmgr_printinfo() so its output lines up in vn_printf().
- lockmgr_printinfo() now ends with a newline, so remove an extra newline from vn_printf().
|
187960 |
31-Jan-2009 |
bz |
After r186194 the *fs_strategy() functions always return 0. So we are no longer interested in the error returned from the *fs_doio() functions. With that we can remove the error variable as its value is unused now.
Submitted by: Christoph Mallon christoph.mallon@gmx.de
|
187959 |
31-Jan-2009 |
bz |
Remove unused local variables.
Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks
|
187864 |
28-Jan-2009 |
ed |
Mark most often used sysctl's as MPSAFE.
After running a `make buildkernel', I noticed most of the Giant locks in sysctl are only caused by a very small amount of sysctl's:
- sysctl.name2oid. This one is locked by SYSCTL_LOCK, just like sysctl.oidfmt.
- kern.ident, kern.osrelease, kern.version, etc. These are just constant strings.
- kern.arandom, used by the stack protector. It is already protected by arc4_mtx.
I also saw the following sysctl's show up. Not as often as the ones above, but still quite often:
- security.jail.jailed. Also mark security.jail.list as MPSAFE. They don't need locking or already use allprison_lock.
- kern.devname, used by devname(3), ttyname(3), etc.
This seems to reduce Giant locking inside sysctl by ~75% in my primitive test setup.
|
187840 |
28-Jan-2009 |
imp |
Use the correct field name for the size of the sierra_id. While this is the same size as id, and is unlikely to change, it seems better to use the correct field here. There's no difference in the generated code.
|
187838 |
28-Jan-2009 |
jhb |
Mark cd9660 MPSAFE and add support for using shared vnode locks during pathname lookups.
- Remove 'i_offset' and 'i_ino' from the ISO node structure and replace them with local variables in the lookup routine instead.
- Cache a copy of 'i_diroff' for use during a lookup in a local variable.
- Save a copy of the found directory entry in a malloc'd buffer after a successful lookup before getting the vnode. This allows us to release the buffer holding the directory block before calling vget() which otherwise resulted in a LOR between "bufwait" and the vnode lock.
- Use an inlined version of vn_vget_ino() to handle races with .. lookups. I had to inline the code here since cd9660 uses an internal vget routine to save a disk I/O that would otherwise re-read the directory block.
- Honor the requested locking flags during lookups to allow for shared locking.
- Honor the requested locking flags passed to VFS_ROOT() and VFS_VGET() similar to UFS.
- Don't make every ISO 9660 vnode hold a reference on the vnode of the underlying device vnode of the mountpoint. The mountpoint already holds a suitable reference.
|
187836 |
28-Jan-2009 |
jhb |
Sync with ufs_vnops.c:1.245 and remove support for accessing device nodes in ISO 9660 filesystems.
|
187832 |
28-Jan-2009 |
jhb |
Assert an exclusive vnode lock for fifo_cleanup() and fifo_close() since they change v_fifoinfo.
Discussed with: ups (a while ago)
|
187830 |
28-Jan-2009 |
ed |
Last step of splitting up minor and unit numbers: remove minor().
Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev().
We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check.
This commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now.
I suspect umajor() and uminor() aren't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.
|
187715 |
26-Jan-2009 |
kib |
The kernel may do unbalanced calls to fifo_close() for fifo vnode, without corresponding number of fifo_open(). This causes assertion failure in fifo_close() due to vp->v_fifoinfo being NULL for kernel with INVARIANTS, or NULL pointer dereference otherwise. In fact, we may ignore excess calls to fifo_close() without bad consequences.
Turn KASSERT() into the return, and print warning for now.
Tested by: pho Reviewed by: rwatson MFC after: 2 weeks
|
187199 |
13-Jan-2009 |
trasz |
Turn a "panic: non-decreasing id" into an error printf. This seems to be caused by a metadata corruption that occurs quite often after unplugging a pendrive during write activity.
Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
187058 |
11-Jan-2009 |
trasz |
Fix msdosfs_print(), which in turn fixes "show lockedvnods" for msdosfs vnodes.
Reviewed by: kib Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
186981 |
09-Jan-2009 |
marcus |
Fix a deadlock which can occur due to a pseudofs vnode not getting unlocked.
Reported by: Richard Todd <rmtodd@ichotolot.servalan.com> Reviewed by: kib Approved by: kib
|
186911 |
08-Jan-2009 |
trasz |
Don't panic with "vinvalbuf: dirty bufs" when the mounted device that was being written to goes away.
Reviewed by: kib, scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
186617 |
30-Dec-2008 |
marcus |
Add a VOP_VPTOCNP implementation for pseudofs which covers file systems such as procfs and linprocfs.
This implementation's locking was enhanced by kib.
Reviewed by: kib des Approved by: des kib Tested by: pho
|
186565 |
29-Dec-2008 |
kib |
When the insmntque() in the pfs_vncache_alloc() fails, vop_reclaim calls pfs_vncache_free() that removes pvd from the list, while it is not yet put on the list.
Prevent the invalid removal from the list by clearing pvd_next and pvd_prev for the newly allocated pvd, and only move pfs_vncache list head when the pvd was at the head.
Suggested and approved by: des MFC after: 2 weeks
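The removal rule described above can be sketched on a generic doubly linked list. The ex_ names are hypothetical and this is not the pfs_vncache code; it only shows why a node with cleared links can be "removed" harmlessly before it was ever inserted:

```c
#include <stddef.h>

/* Generic list node; a freshly allocated node has both links NULL. */
struct ex_node {
	struct ex_node	*next;
	struct ex_node	*prev;
};

/*
 * Unlink n from the list rooted at *headp.  The head pointer moves only
 * when n really is the head, so calling this on a node that was never
 * inserted (next == prev == NULL) leaves the list untouched.
 */
void
ex_list_remove(struct ex_node **headp, struct ex_node *n)
{
	if (n->next != NULL)
		n->next->prev = n->prev;
	if (n->prev != NULL)
		n->prev->next = n->next;
	else if (*headp == n)
		*headp = n->next;
	n->next = n->prev = NULL;
}
```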
|
186563 |
29-Dec-2008 |
kib |
vm_map_lock_read() does not increment map->timestamp, so we should compare map->timestamp with saved timestamp after map read lock is reacquired, not with saved timestamp + 1. The only consequence of the +1 was unconditional lookup of the next map entry, though.
Tested by: pho Approved by: des MFC after: 2 weeks
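A toy model of the comparison, assuming (as the message implies) that only the exclusive lock bumps the timestamp. The ex_ types and functions are invented for the example and do no real locking:

```c
/* Toy map carrying only the generation counter. */
struct ex_map {
	unsigned int	timestamp;
};

/* Model of the exclusive lock: bumps the timestamp. */
void
ex_map_lock(struct ex_map *m)
{
	m->timestamp++;
}

/* Model of the read lock: leaves the timestamp alone. */
void
ex_map_lock_read(struct ex_map *m)
{
	(void)m;
}

/*
 * Nonzero if the map was write-locked since 'saved' was recorded.  After
 * re-taking the read lock, compare against 'saved' itself; comparing
 * against saved + 1 would report a change even when nothing happened.
 */
int
ex_map_changed(const struct ex_map *m, unsigned int saved)
{
	return (m->timestamp != saved);
}
```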
|
186562 |
29-Dec-2008 |
kib |
Use curproc->p_sysent->sv_flags bit SV_ILP32 for detection of the 32 bit caller, instead of direct comparison with ia32_freebsd_sysvec.
Tested by: pho Approved by: des MFC after: 2 weeks
|
186561 |
29-Dec-2008 |
kib |
Drop the pseudofs vnode lock around call to pfs_read handler. The handler may need to lock arbitrary vnodes, causing either lock order reversal or recursive vnode lock acquisition.
Tested by: pho Approved by: des MFC after: 2 weeks
|
186560 |
29-Dec-2008 |
kib |
After the pfs_vncache_mutex is dropped, another thread may attempt to do pfs_vncache_alloc() for the same pfs_node and pid. In this case, we could end up with two vnodes for the pair. Recheck the cache under the locked pfs_vncache_mutex after all sleeping operations are done [1].
This case mostly cannot happen now because pseudofs uses exclusive vnode locking for lookup. But it does drop the vnode lock for dotdot lookups, and Marcus' pseudofs_vptocnp implementation is vulnerable too.
Do not call free() on the struct pfs_vdata after insmntque() failure, because vp->v_data points to the structure, and pseudofs_reclaim() frees it by the call to pfs_vncache_free().
Tested by: pho [1] Approved by: des MFC after: 2 weeks
|
186194 |
16-Dec-2008 |
trasz |
According to phk@, VOP_STRATEGY should never, _ever_, return anything other than 0. Make it so. This fixes "panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648", encountered when writing to an orphaned filesystem. Reason for the panic was the following assert: KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp)); at vfs_bio:bufstrategy().
Reviewed by: scottl, phk Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
185984 |
12-Dec-2008 |
kib |
Reference the vmspace of the process being inspected by procfs, linprocfs and sysctl kern_proc_vmmap handlers.
Reported and tested by: pho Reviewed by: rwatson, des MFC after: 1 week
|
185980 |
12-Dec-2008 |
kib |
Do not leak devfs_de_interlock on error.
Another pointy hat for my collection.
|
185959 |
12-Dec-2008 |
marcus |
Implement VOP_VPTOCNP for devfs. Directory and character device vnodes are properly translated to their component names.
Reviewed by: arch Approved by: kib
|
185958 |
12-Dec-2008 |
marcus |
Add a simple VOP_VPTOCNP implementation for deadfs which returns EBADF.
Reviewed by: arch Approved by: kib
|
185864 |
10-Dec-2008 |
kib |
Relock user map earlier, to have the lock held when break leaves the loop earlier due to sbuf error.
Pointy hat to: me Submitted by: dchagin
|
185766 |
08-Dec-2008 |
kib |
Make two style changes to create new commit and document proper commit message for r185765.
Noted by: rdivacky Requested by: des
Commit message for r185765 should be: In procfs map handler, and in linprocfs maps handler, do not call vn_fullpath() while having vm map locked. This is done in anticipation of the vop_vptocnp commit, that would make vn_fullpath sometimes acquire vnode lock.
Also, in linprocfs, maps handler already acquires vnode lock.
No objections from: des MFC after: 2 weeks
|
185765 |
08-Dec-2008 |
kib |
Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work.
Patch is a forward-port of des@'s one, and was adapted to the current code by dchagin@ and me.
Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week
|
185361 |
27-Nov-2008 |
kientzle |
The timezone byte is a signed value, treat it as such. Otherwise, time zone information for time zones west of GMT gets discarded.
PR: kern/128934 Submitted by: J.R. Oldroyd MFC after: 4 days
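A short sketch of the signedness issue, assuming the ISO 9660 convention that the timezone byte counts 15-minute intervals from GMT. The helper name is hypothetical and this is not the cd9660 code:

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert the raw timezone byte to an offset in
 * seconds.  The byte is a two's complement value, so 0xF0 means -16
 * intervals (four hours west of GMT), not +240.
 */
int
iso_tz_to_seconds(uint8_t raw)
{
	int tz;

	/* Sign-extend by hand so the result does not depend on char signedness. */
	tz = (raw >= 128) ? (int)raw - 256 : (int)raw;
	return (tz * 15 * 60);
}
```

Read as unsigned, 0xF0 would become 240 intervals, which falls outside any plausible range and is the kind of value a validity check would throw away.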
|
185335 |
26-Nov-2008 |
kib |
In null_lookup(), do the needed cleanup instead of panicking saying the cleanup is needed.
Reported by: kris, pho Tested by: pho MFC after: 2 weeks
|
185334 |
26-Nov-2008 |
lulf |
- Support IEEE_P1282 and IEEE_1282 tags in the rock ridge extensions record.
PR: kern/128942 Submitted by: "J.R. Oldroyd" <fbsd - at - opal.com>
|
185284 |
25-Nov-2008 |
daichi |
Simplify mode_t check treatment (suggested by trasz). Semantically, trasz's code is better than the prior one.
Submitted by: trasz Reviewed by: Masanori OZAWA <ozawa@ongs.co.jp>
|
185283 |
25-Nov-2008 |
daichi |
Fixes Unionfs socket issue reported as kern/118346.
PR: 118346 Submitted by: Masanori OZAWA <ozawa@ongs.co.jp> Discussed at: devsummit Strassburg, EuroBSDCon2008 Discussed with: rwatson, gnn, hrs MFC after: 2 weeks
|
185071 |
18-Nov-2008 |
jhb |
- Fix a typo in a comment. - Whitespace fix. - Remove #if 0'd BSD 4.x code for flushing busy buffers from a mountpoint during an unmount. FreeBSD uses vflush() for this.
|
185070 |
18-Nov-2008 |
jhb |
When looking up the vnode for the device to mount the filesystem on, ask NDINIT to return a locked vnode instead of letting it drop the lock and return a referenced vnode and then relock the vnode a few lines down. This matches the behavior of other filesystem mount routines.
|
185069 |
18-Nov-2008 |
jhb |
Remove copy/paste code from UFS to handle sparse blocks. While Rock Ridge does support sparse files, the cd9660 code does not currently support them.
|
185068 |
18-Nov-2008 |
jhb |
Remove unused i_flags field and IN_ACCESS flag from cd9660 in-memory i-nodes. cd9660 doesn't support access times.
|
184652 |
04-Nov-2008 |
jhb |
Remove unnecessary locking around vn_fullpath(). The vnode lock for the vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed.
In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2).
For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock.
MFC after: 1 month
|
184650 |
04-Nov-2008 |
jhb |
Don't pass WANTPARENT to the pathname lookup of the mount point for a unionfs mount just so we can immediately drop the reference on the parent directory vnode without using it.
|
184595 |
03-Nov-2008 |
trasz |
Fix a few missed accmode changes in coda.
Approved by: rwatson (mentor)
|
184588 |
03-Nov-2008 |
dfr |
Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation.
The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code.
To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf.
As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks.
Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd.
The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option.
Sponsored by: Isilon Systems MFC after: 1 month
|
184572 |
02-Nov-2008 |
rwatson |
Catch up with netsmb locking: explicit thread arguments no longer required.
|
184557 |
02-Nov-2008 |
trasz |
Remove the call to getinoquota() from ntfs_access. How did it get there?!
Approved by: rwatson (mentor)
|
184413 |
28-Oct-2008 |
trasz |
Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which are 16 bits wide.
Approved by: rwatson (mentor)
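The truncation concern behind accmode_t can be sketched roughly as follows. This is only an illustration, not the committed code: the sketch_* typedefs, the flag values and the helper are hypothetical, and the sketch assumes the wider type is a plain int.

```c
#include <stdint.h>

/* Hypothetical stand-ins: a 16-bit mode type and a wider access-mode type. */
typedef uint16_t sketch_mode_t;
typedef int sketch_accmode_t;

/* Illustrative flag values only; the second one lies above bit 15. */
#define	SKETCH_VREAD	0x0100
#define	SKETCH_VEXTRA	0x10000

/*
 * Return 1 if the access mode survives a round trip through the
 * 16-bit type, 0 if bits were lost in the narrowing conversion.
 */
int
sketch_fits_in_mode(sketch_accmode_t accmode)
{
	sketch_mode_t narrowed = (sketch_mode_t)accmode;

	return ((sketch_accmode_t)narrowed == accmode);
}
```

Any flag above bit 15 is silently dropped by the narrow type, which is why a dedicated wider type is needed before more V* constants can be added.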
|
184214 |
23-Oct-2008 |
des |
Fix a number of style issues in the MALLOC / FREE commit. I've tried to be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
|
184205 |
23-Oct-2008 |
des |
Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after: 3 months
|
183806 |
12-Oct-2008 |
rwatson |
The locking in portalfs's socket connect code is no less correct than identical code in connect(2), so remove XXX that it might be incorrect.
MFC after: 3 days
|
183754 |
10-Oct-2008 |
attilio |
Remove the unused struct thread argument from the bufobj interface. In particular, the KPI of the following functions is modified:
- bufobj_invalbuf()
- bufsync()
and the BO_SYNC() "virtual method" of the buffer objects set. The main consumers of the bufobj functions are affected by this change too; in particular, the functions whose KPI changed are:
- vinvalbuf()
- g_vfs_close()
Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit.
As a side note, please consider the 'curthread' argument passed to VOP_SYNC() (in bufsync()) as temporary, as it will be removed as soon as possible.
Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
183649 |
06-Oct-2008 |
rwatson |
Use soconnect2() rather than directly invoking uipc_connect2() to interconnect two UNIX domain sockets.
MFC after: 3 days
|
183600 |
04-Oct-2008 |
kib |
Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work.
The patch is a forward-port of des@'s, and was adapted to the current code by dchagin@ and me.
Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week
|
183578 |
03-Oct-2008 |
trasz |
Fix Vflags abuse in fdescfs. There should be no functional changes.
Approved by: rwatson (mentor)
|
183577 |
03-Oct-2008 |
trasz |
Fix Vflags abuse in cd9660. There should be no functional changes.
Approved by: rwatson (mentor)
|
183550 |
02-Oct-2008 |
zec |
Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit
Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs.
Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().
Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.).
All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*).
(*) netipsec/keysock.c did not validate depending on compile time options.
Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
183383 |
26-Sep-2008 |
kib |
Save previous content of the td_fpop before storing the current filedescriptor into it. Make sure that td_fpop is NULL when calling d_mmap from dev_pager_getpages().
This change guards against the td_fpop field being non-NULL with private state for another device, and against sudden clearing of td_fpop. This could occur when either a driver method calls another driver through a file descriptor operation, or a page fault happens while a driver is writing to memory backed by another driver.
Noted by: rwatson Tested by: rnoland MFC after: 3 days
|
183381 |
26-Sep-2008 |
ed |
Remove unit2minor() use from kernel code.
When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after I turned dev2unit(), minor(), unit2minor() and minor2unit() into macros. The unit2minor() and minor2unit() macros were no-ops.
We'd better not remove these four macros from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit().
Reviewed by: kib
|
183299 |
23-Sep-2008 |
obrien |
The kernel implemented 'memcmp' is an alias for 'bcmp'. However, memcmp and bcmp are not the same thing. 'man bcmp' states that the return is "non-zero" if the two byte strings are not identical. Whereas 'man memcmp' states that the return is the "difference between the first two differing bytes (treated as unsigned char values" if the two byte strings are not identical.
So provide a proper memcmp(9), but it is a C implementation not a tuned assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).
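The difference between the two contracts can be sketched in plain C. This is a minimal illustration under the semantics quoted above, not the committed kernel code; the sketch_* names are hypothetical.

```c
#include <stddef.h>

/*
 * memcmp-style contract: return the difference between the first pair
 * of differing bytes, compared as unsigned char, or 0 if all n bytes
 * are equal.
 */
int
sketch_memcmp(const void *s1, const void *s2, size_t n)
{
	const unsigned char *p1 = s1, *p2 = s2;

	for (; n != 0; n--, p1++, p2++) {
		if (*p1 != *p2)
			return (*p1 - *p2);
	}
	return (0);
}

/*
 * bcmp-style contract: only "zero" versus "non-zero" is meaningful,
 * so the sign and magnitude of the result carry no information.
 */
int
sketch_bcmp(const void *s1, const void *s2, size_t n)
{
	return (sketch_memcmp(s1, s2, n) != 0);
}
```

A caller that sorts by the sign of the result needs the memcmp-style contract; a caller that only tests for equality can use either.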
|
183230 |
21-Sep-2008 |
ed |
Already initialize the vfs timestamps inside the cdev upon allocation.
In the MPSAFE TTY branch I noticed the vfs timestamps inside devfs were allocated with 0, where the getattr() routine bumps the timestamps to boottime if the value is below 3600. The reason why it has been designed like this, is because timestamps during boot are likely to be invalid.
This means that device nodes that are created on demand (posix_openpt()) have timestamps with a value of boottime, which is not what we want. Solve this by calling vfs_timestamp() inside devfs_alloc().
Discussed with: kib
|
183215 |
20-Sep-2008 |
kib |
fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(), vattr_null() or by zeroing it. Remove these to allow preinitialization of fields to work in vn_stat(). This is needed to get birthtime initialized correctly.
Submitted by: Jaakko Heinonen <jh saunalahti fi> Discussed on: freebsd-fs MFC after: 1 month
|
183214 |
20-Sep-2008 |
kib |
Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR(). NODEV is more appropriate when va_rdev doesn't have a meaningful value.
Submitted by: Jaakko Heinonen <jh saunalahti fi> Suggested by: bde Discussed on: freebsd-fs MFC after: 1 month
|
183212 |
20-Sep-2008 |
kib |
Initialize va_flags and va_filerev properly in VOP_GETATTR(). Don't initialize va_vaflags and va_spare because they are not part of the VOP_GETATTR() API. Also don't initialize birthtime to ctime or zero.
Submitted by: Jaakko Heinonen <jh saunalahti fi> Reviewed by: bde Discussed on: freebsd-fs MFC after: 1 month
|
182943 |
11-Sep-2008 |
ed |
Fix two small typos in comments in the nullfs vnops code.
Submitted by: Jille Timmermans <jille quis cx>
|
182739 |
03-Sep-2008 |
delphij |
Reflect license change of NetBSD code.
Obtained from: NetBSD MFC after: 3 days
|
182600 |
01-Sep-2008 |
kib |
In rev. 1.17 (r33548) of msdosfs_fat.c, relative cluster numbers were replaced by file relative sector numbers as the buffer block number when zero-padding a file during extension. Revert the change, it causes wrong blocks filled with zeroes on seeking beyond end of file.
PR: kern/47628 Submitted by: tegge MFC after: 3 days
|
182371 |
28-Aug-2008 |
attilio |
Decontextualize the VOP_GETATTR / VOP_SETATTR pair, as the passed thread was always curthread and served no purpose.
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
181905 |
20-Aug-2008 |
ed |
Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following:
- Improved driver model:
The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers.
If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver.
- Improved hotplugging:
With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc).
The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly.
- Improved performance:
One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters.
Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING.
Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan
|
181803 |
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
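The "V_ prefix mapped back to the global name" idea can be sketched as below. The variable and function names are hypothetical and only illustrate the NOP mapping described in this commit message, not any actual network stack symbol.

```c
/* Hypothetical global standing in for a network stack variable. */
int sketch_ifcount = 0;

/*
 * Step 1 style mapping: the V_ name is only an alias for the global,
 * so code using it compiles to exactly what it did before.
 */
#define	V_sketch_ifcount	sketch_ifcount

/* Consumer code refers to the V_ name only. */
int
sketch_attach_interface(void)
{
	V_sketch_ifcount++;
	return (V_sketch_ifcount);
}
```

Because consumers only touch the V_ name, the macro can later be redirected elsewhere without editing every use site again.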
|
181635 |
12-Aug-2008 |
kib |
Remove unnecessary locking around pointer fetch.
Requested by: jhb
|
180291 |
05-Jul-2008 |
rwatson |
Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates.
Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted.
MFC after: 3 weeks
|
180252 |
04-Jul-2008 |
kib |
The uniqdosname() function takes char[12] as its third argument.
Found by: -fstack-protector Reported by: dougb Tested by: dougb, Rainer Hurling <rhurlin gwdg de> MFC after: 3 days
|
180139 |
01-Jul-2008 |
rwatson |
Remove unused 'td' arguments from smbfs_hash_lock() and smbfs_hash_unlock().
MFC after: 3 days
|
179926 |
22-Jun-2008 |
gonzo |
Get pointer to devfs_ruleset struct after garbage collection has been performed. Otherwise if ruleset is used by given mountpoint and is empty it's freed by devfs_ruleset_reap and pointer becomes bogus.
Submitted by: Mateusz Guzik <mjguzik@gmail.com> PR: kern/124853
|
179828 |
16-Jun-2008 |
kib |
Struct cdev is always the member of the struct cdev_priv. When devfs needed to promote cdev to cdev_priv, the si_priv pointer was followed.
Use member2struct() to calculate address of the wrapping cdev_priv. Rename si_priv to __si_reserved.
Tested by: pho Reviewed by: ed MFC after: 2 weeks
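Recovering the wrapping structure from a pointer to an embedded member can be sketched as follows. The macro and structure names here are hypothetical stand-ins for illustration (in the style of a container-of helper); they are not the kernel definitions.

```c
#include <stddef.h>

/*
 * Given a pointer x to member m embedded in "struct s", step back by
 * the member's offset to reach the enclosing structure.
 */
#define	SKETCH_MEMBER2STRUCT(s, m, x)					\
	((struct s *)(void *)((char *)(x) - offsetof(struct s, m)))

/* Hypothetical public part, always embedded in the private part. */
struct sketch_cdev {
	int	si_flags;
};

struct sketch_cdev_priv {
	int			cdp_flags;
	struct sketch_cdev	cdp_c;
};

/* Promote the public structure to its wrapper without a back pointer. */
struct sketch_cdev_priv *
sketch_cdev2priv(struct sketch_cdev *c)
{
	return (SKETCH_MEMBER2STRUCT(sketch_cdev_priv, cdp_c, c));
}
```

This only works if the public structure is never allocated on its own, which is the invariant the commit message states.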
|
179808 |
15-Jun-2008 |
kib |
Do not redo the vnode tear-down work already done by insmntque() when vnode cannot be put on the vnode list for mount.
Reported and tested by: marck Guilty party: me MFC after: 3 days
|
179726 |
11-Jun-2008 |
ed |
Don't enforce unique device minor number policy anymore.
Except for the case where we use the cloner library (clone_create() and friends), there is no reason to enforce a unique device minor number policy. There are various drivers in the source tree that allocate unr pools and such to provide minor numbers, without using them themselves.
Because we still need to support unique device minor numbers for the cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's that are used in combination with the cloner library should be marked with this flag to make the cloning work.
This means drivers can now freely use si_drv0 to store their own flags and state, making it effectively the same as si_drv1 and si_drv2. We still keep the minor() and dev2unit() routines around to make drivers happy.
The NTFS code also used the minor number in its hash table. We should not do this anymore. If the si_drv0 field would be changed, it would no longer end up in the same list.
Approved by: philip (mentor)
|
179722 |
11-Jun-2008 |
kib |
In cd9660_readdir vop, always initialize the idp->uio_off member.
The while loop that is assumed to initialize the uio_off later may not be entered at all, causing an uninitialized value to be returned in uio->uio_offset.
PR: 122925 Submitted by: Jaakko Heinonen <jh saunalahti fi> MFC after: 1 week
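The shape of this bug and its fix can be sketched in a few lines. The structure, the function and the fixed record size below are hypothetical simplifications, not the cd9660 code.

```c
/* Hypothetical per-call state, loosely modelled on the description. */
struct sketch_idp {
	long	uio_off;
};

#define	SKETCH_RECORD_SIZE	32	/* illustrative fixed entry size */

/*
 * Walk directory records from "start" up to "end" and return the
 * offset to report back to the caller.
 */
long
sketch_readdir(long start, long end)
{
	struct sketch_idp idp;
	long off = start;

	/* The fix: always initialize before the loop. */
	idp.uio_off = off;

	while (off < end) {
		off += SKETCH_RECORD_SIZE;
		idp.uio_off = off;
	}
	/* Well-defined even if the loop body never ran. */
	return (idp.uio_off);
}
```

Without the up-front assignment, a call whose starting offset is already at the end would return whatever happened to be on the stack.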
|
179554 |
05-Jun-2008 |
kib |
When devfs_allocv() committed to create new vnode, since de_vnode is NULL, the dm_lock is held while the newly allocated vnode is locked. Since no other threads may try to lock the new vnode yet, the LOR there cannot result in the deadlock.
Shut down the witness warning to note this fact.
Tested by: pho Prodded by: attilio
|
179475 |
01-Jun-2008 |
ed |
Revert the changes I made to devfs_setattr() in r179457.
As discussed with Robert Watson and John Baldwin, it would be better if PTY's are created with proper permissions, turning grantpt() into a no-op.
Bypassing security frameworks like MAC by passing NOCRED to VOP_SETATTR() will only make things more complex.
Approved by: philip (mentor)
|
179457 |
31-May-2008 |
ed |
Merge back devfs changes from the mpsafetty branch.
In the mpsafetty branch, PTY's are allocated through the posix_openpt() system call. The controller side of a PTY now uses its own file descriptor type (just like sockets, vnodes, pipes, etc).
To remain compatible with existing FreeBSD and Linux C libraries, we can still create PTY's by opening /dev/ptmx or /dev/ptyXX. These nodes implement d_fdopen(). Devfs has been slightly changed here, to allow finit() to be called from d_fdopen().
The routine grantpt() has also been moved into the kernel. This routine is a little odd, because it needs to bypass standard UNIX permissions. It needs to change the owner/group/mode of the slave device node, which may often not be possible. The old implementation solved this by spawning a setuid utility.
When VOP_SETATTR() is called with NOCRED, devfs_setattr() dereferences ap->a_cred, causing a kernel panic. Change the de_{uid,gid,mode} code to allow changes when a->a_cred is set to NOCRED.
Approved by: philip (mentor)
|
179288 |
24-May-2008 |
lulf |
- Add locking to all filesystem operations in fdescfs and flag it as MPSAFE.
- Use proper synchronization primitives to protect the internal fdesc node cache used in fdescfs.
- Properly initialize and uninitialize hash.
- Remove unused functions.
Since fdescfs might recurse on itself, adding proper locking to it needed some tricky workarounds in some parts to make it work. For instance, a descriptor in fdescfs could refer to an open descriptor to itself, thus forcing the thread to recurse on vnode locks. Because of this, other race conditions also had to be fixed.
Tested by: pho Reviewed by: kib (mentor) Approved by: kib (mentor)
|
179247 |
23-May-2008 |
kib |
When vget() fails (because the vnode has been reclaimed), there is no sense to loop trying to vget() the vnode again.
PR: 122977 Submitted by: Arthur Hartwig <arthur.hartwig nokia com> Tested by: pho Reviewed by: jhb MFC after: 1 week
|
179175 |
21-May-2008 |
kib |
Implement the per-open file data for the cdev.
The patch does not change the cdevsw KBI. Management of the data is provided by the functions
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
int devfs_get_cdevpriv(void **datap);
void devfs_clear_cdevpriv(void);
All of the functions are supposed to be called from the cdevsw method contexts.
- devfs_set_cdevpriv assigns the priv as private data for the file descriptor which is used to initiate the currently performed driver operation. dtr is the function that will be called when either the last reference to the file goes away, the device is destroyed or devfs_clear_cdevpriv is called.
- devfs_get_cdevpriv is the obvious accessor.
- devfs_clear_cdevpriv allows clearing the private data for a still-open file.
Implementation keeps the driver-supplied pointers in the struct cdev_privdata, that is referenced both from the struct file and struct cdev, and cannot outlive any of the referee.
Man pages will be provided after the KPI stabilizes.
Reviewed by: jhb Useful suggestions from: jeff, antoine Debugging help and tested by: pho MFC after: 1 month
|
179060 |
16-May-2008 |
markus |
Fix and speedup timestamp calculations which is roughly based on the patch in the mentioned PR:
- bounds check time->month as it is used as an array index
- fix usage of time->month as array index (month is 1-12)
- fix calculation based on time->day (day is 1-31)
- fix the speedup code as it doesn't calculate correct timestamps before the year 2000 and reduce the number of calculations in the year-by-year code
- speedup month calculations by replacing the array content with cumulative values
- add microseconds calculation
- fix an endian problem
PR: kern/97786 Submitted by: Andriy Gapon <avg@topspin.kiev.ua> Reviewed by: scottl (earlier version) Approved by: emax (mentor) MFC after: 1 week
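Several of the points above (bounds-checking the month, treating month and day as 1-based, and using a cumulative day table) can be sketched together. This is a simplified illustration with hypothetical names and a 1970 epoch, not the committed timestamp code, and it omits the time-of-day, microsecond and endianness handling.

```c
#include <stdint.h>

/* Cumulative days before each month in a non-leap year (index 0 = January). */
static const int sketch_cum_days[12] = {
	0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
};

static int
sketch_is_leap(int year)
{
	return ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0);
}

/*
 * Return the number of days between 1970-01-01 and the given date,
 * or -1 if a field is out of range. month is 1-12 and day is 1-31.
 */
int64_t
sketch_days_since_epoch(int year, int month, int day)
{
	int64_t days = 0;
	int y;

	/* Bounds check first: month is used as an array index below. */
	if (year < 1970 || month < 1 || month > 12 || day < 1 || day > 31)
		return (-1);

	for (y = 1970; y < year; y++)
		days += sketch_is_leap(y) ? 366 : 365;

	/* month is 1-based, the table is 0-based. */
	days += sketch_cum_days[month - 1];
	if (month > 2 && sketch_is_leap(year))
		days++;

	/* day is 1-based as well. */
	days += day - 1;
	return (days);
}
```

The cumulative table replaces a per-month summation loop with a single lookup, which is the kind of speedup the list describes.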
|
179030 |
15-May-2008 |
attilio |
lockinit() can't accept LK_EXCLUSIVE as an initialization flag, so just drop it.
Reported by: Josh Carroll <josh dot carroll at gmail dot com> Submitted by: jhb
|
178834 |
07-May-2008 |
jhb |
Don't explicitly drop Giant around d_open/d_fdopen/d_close for MPSAFE drivers. Since devfs is already marked MPSAFE it shouldn't be held anyway.
MFC after: 2 weeks Discussed with: phk
|
178822 |
07-May-2008 |
daichi |
- change function name from *_vdir to *_vnode because VSOCK has been added as cache target. Now they process not only VDIR but also VSOCK. - fixed panic issue caused by cache incorrect free process by "umount -f"
Submitted by: Masanori OZAWA <ozawa@ongs.co.jp> MFC after: 1 week
|
178491 |
25-Apr-2008 |
daichi |
o Fixed multi thread access issue reported by Alexander V. Chernikov (admin@su29.net) fixed: kern/109950
PR: kern/109950 Submitted by: Alexander V. Chernikov (admin@su29.net) Reviewed by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week
|
178485 |
25-Apr-2008 |
daichi |
o Improved unix socket connection issue fixed: kern/118346
PR: kern/118346 Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week
|
178484 |
25-Apr-2008 |
daichi |
o Fixed rename panic issue
Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week
|
178483 |
25-Apr-2008 |
daichi |
o Fixed inaccessible issue especially including devfs on unionfs case. fixed also: kern/117829
PR: kern/117829 Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week
|
178478 |
25-Apr-2008 |
daichi |
o Added system hang-up process when VOP_READDIR of unionfs_nodeget() returns not end of the file status on debug mode (DIAGNOSTIC defined) kernel.
Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week
|
178243 |
16-Apr-2008 |
kib |
Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock.
Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode.
The implementation of the lf_purgelocks() is submitted by dfr.
Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks
|
178195 |
14-Apr-2008 |
dfr |
When calling lf_advlock to unlock a record, make sure that ap->a_fl->l_type is F_UNLCK otherwise we trigger a LOCKF_DEBUG panic.
MFC after: 3 days
|
177957 |
06-Apr-2008 |
attilio |
Optimize lockmgr in order to get rid of the pool mutex interlock, of the state transitioning flags and of msleep(9) calls. Use, instead, an algorithm very similar to what sx(9) and rwlock(9) already do and direct accesses to the sleepqueue(9) primitive.
In order to avoid writer starvation a mechanism very similar to what rwlock(9) uses now is implemented, with the corresponding per-thread shared lockmgrs counter.
This patch also adds 2 new functions to lockmgr KPI: lockmgr_rw() and lockmgr_args_rw(). These two are like the 2 "normal" versions, but they both accept a rwlock as interlock. In order to realize this, the general lockmgr manager function "__lockmgr_args()" has been implemented through the generic lock layer. It supports all the blocking primitives, but currently only these 2 mappers live.
The patch drops the support for WITNESS atm, but it will probably be added soon. Also, there is a little race in the draining code which is also present in the current CVS stock implementation: if some sharers, once they wake up, are in the runqueue they can contend the lock with the exclusive drainer. This is hard to fix, but the now committed code mitigates this issue a lot better than the (past) CVS version. In addition, the KA_HELD and KA_UNHELD assertions have been made mute because they are dangerous and they will no longer be supported soon.
In order to avoid namespace pollution, stack.h is split into two parts: one which includes only the "struct stack" definition (_stack.h) and one defining the KPI. In this way, newly added _lockmgr.h can just include _stack.h.
The kernel ABI is heavily changed by this commit (the now committed version of "struct lock" is a lot smaller than the previous one) and the KPI is broken by the introduction of lockmgr_rw() / lockmgr_args_rw(), so manpages and __FreeBSD_version will be updated accordingly.
Tested by: kris, pho, jeff, danger Reviewed by: jeff Sponsored by: Google, Summer of Code program 2007
|
177910 |
04-Apr-2008 |
kib |
The temporary workaround for the call to the vget() without lock type in the fdesc_allocvp(). The caller of the fdesc_allocvp() expects that the returned vnode is not reclaimed. Do lock the vnode exclusive and drop the lock after.
Reported by: pho Reviewed by: jeff
|
177785 |
31-Mar-2008 |
kib |
Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9).
Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho
|
177725 |
29-Mar-2008 |
jeff |
- Simplify null_hashget() and null_hashins() by using vref() rather than a complex series of steps involving vget() without a lock type to emulate the same thing.
|
177633 |
26-Mar-2008 |
dfr |
Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf.
Highlights include:
* Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts.
* Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation.
* Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux.
* Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket.
* Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock.
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers.
Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks
|
177493 |
22-Mar-2008 |
jeff |
- Complete part of the unfinished bufobj work by consistently using BO_LOCK/UNLOCK/MTX when manipulating the bufobj.
- Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock.
- Exploit this new lock order to simplify softdep_check_suspend().
- A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find.
Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)
|
177458 |
20-Mar-2008 |
kib |
Do not dereference cdev->si_cdevsw, use the dev_refthread() to properly obtain the reference. In particular, this fixes the panic reported in the PR. Remove the comments stating that this needs to be done.
PR: kern/119422 MFC after: 1 week
|
177091 |
12-Mar-2008 |
jeff |
Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
|
176745 |
02-Mar-2008 |
rwatson |
Replace lockmgr lock protecting nwfs vnode hash table with an sx lock.
MFC after: 1 month
|
176744 |
02-Mar-2008 |
rwatson |
Replace lockmgr lock protecting smbfs node hash table with sx lock.
MFC after: 1 month
|
176708 |
01-Mar-2008 |
attilio |
- Handle buffer lock waiters count directly in the buffer cache instead of relying on the lockmgr support [1]:
  * bump the waiters only if the interlock is held
  * let brelvp() return the waiters count
  * rely on brelvp() instead of BUF_LOCKWAITERS() in order to check for the waiters number
- Remove a namespace pollution introduced recently with lockmgr.h including lock.h by including lock.h directly in the consumers and making it mandatory for using lockmgr.
- Modify flags accepted by lockinit():
  * introduce LK_NOPROFILE which disables lock profiling for the specified lockmgr
  * introduce LK_QUIET which disables ktr tracing for the specified lockmgr [2]
  * disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it can only be used on a per-instance basis
- Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer used
This patch breaks KPI so __FreeBSD_version will be bumped and manpages updated by further commits. Additionally, the 'struct buf' changes result in a disturbed ABI as well.
[2] Really, currently there is no ktr tracing in the lockmgr, but it will be added soon.
[1] Submitted by: kib Tested by: pho, Andrea Barberio <insomniac at slackware dot it>
|
176583 |
26-Feb-2008 |
kib |
Rename fdescfs vnode from "fdesc" to "fdescfs" to avoid name collision of the vnode lock with the fdesc_mtx mutex. Having different kinds of locks with the same name confuses witness.
|
176578 |
26-Feb-2008 |
rwatson |
Add "Make MPSAFE" to the Coda todo list.
MFC after: 3 days
|
176559 |
25-Feb-2008 |
attilio |
Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread.
As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits.
Tested by: Andrea Barberio <insomniac at slackware dot it>
|
176519 |
24-Feb-2008 |
attilio |
Introduce some functions in the vnode locks namespace and in the ffs namespace in order to handle lockmgr fields in a controlled way instead of spreading bogus stubs all around:
- VN_LOCK_AREC() allows lock recursion for a specified vnode
- VN_LOCK_ASHARE() allows lock sharing for a specified vnode
In FFS land:
- BUF_AREC() allows lock recursion for a specified buffer lock
- BUF_NOREC() disallows recursion for a specified buffer lock
Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as the sole exception.
|
176431 |
21-Feb-2008 |
marcel |
Don't check the bpbSecPerTrack and bpbHeads fields of the BPB. They are typically 0 on new ia64 systems. Since we don't use either field, there's no harm in not checking.
|
176363 |
17-Feb-2008 |
rwatson |
Remove custom queue macros in Coda, replacing them with queue(9) tailq macros. The only semantic change was the need to add a vc_opened field to struct vcomm since we can no longer use the request queue returning to an uninitialized state to hold whether or not the device is open.
MFC after: 1 month
|
176362 |
17-Feb-2008 |
rwatson |
Remove namecache performance-tuning todo for Coda: we now use the FreeBSD name cache.
MFC after: 1 month
|
176309 |
15-Feb-2008 |
rwatson |
The possibly interruptible msleep in coda_call() means well, but is fundamentally fairly confused about how signals work and when it is appropriate for upcalls to be interrupted. In particular, we should be exempting certain upcalls from interruption, we should not always eventually time out sleeping on an upcall, and we should not be interrupting the sleep for certain signals that we currently are (including SIGINFO). This code needs to be reworked in the style of NFS interruptible mounts.
MFC after: 1 month
|
176308 |
15-Feb-2008 |
rwatson |
Spell replys as replies.
MFC after: 1 month
|
176307 |
15-Feb-2008 |
rwatson |
Reorder and clean up make_coda_node(), annotate weaknesses in the implementation.
MFC after: 1 month
|
176263 |
14-Feb-2008 |
rwatson |
Remove debugging code under OLD_DIAGNOSTIC; this is all >10 years old and hasn't been used in that time.
MFC after: 1 month
|
176262 |
14-Feb-2008 |
rwatson |
In Coda, flush the attribute cache for a cnode when its fid is changed, as its synthesized inode number may have changed and we want stat(2) to pick up the new inode number.
MFC after: 1 month
|
176248 |
13-Feb-2008 |
rwatson |
Update cache flushing behavior in light of recent namecache and access cache improvements:
- Flush just access control state on CODA_PURGEUSER, not the full namecache for /coda.
- When replacing a fid on a cnode as a result of, e.g., reintegration after offline operation, we no longer need to purge the namecache entries associated with its vnode.
MFC after: 1 month
|
176238 |
13-Feb-2008 |
rwatson |
Implement a rudimentary access cache for the Coda kernel module, modeled on the access cache found in NFS, smbfs, and the Linux coda module. This is a positive access cache of a single entry per file, tracking recently granted rights, but unlike NFS and smbfs, supporting explicit invalidation by the distributed file system.
For each cnode, maintain a C_ACCCACHE flag indicating the validity of the cache, and a cached uid and mode tracking recently granted positive access control decisions.
Prefer the cache to venus_access() in VOP_ACCESS() if it is valid, and when we must fall back to venus_access(), update the cache.
Allow Venus to clear the access cache, either the whole cache on CODA_FLUSH, or just entries for a specific uid on CODA_PURGEUSER. Unlike the Coda module on Linux, we don't flush all entries on a user purge using a generation number, we instead walk present cnodes and clear only entries for the specific user, meaning it is somewhat more expensive but won't hit all users.
Since the Coda module is aggressive about not keeping around unopened cnodes, the utility of the cache is somewhat limited for files, but works well for directories. We should make Coda less aggressive about GCing cnodes in VOP_INACTIVE() in order to improve the effectiveness of in-kernel caching of attributes and access rights.
MFC after: 1 month
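The access cache described in this entry can be modeled in userland Python. This is only a sketch under stated assumptions: the class and function names are hypothetical, `venus_access` is a stand-in callback for the upcall, and accumulating granted mode bits for the same user is an assumption rather than something the log message states.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Cnode:
    """Toy stand-in for a Coda cnode; only the cache fields are modeled."""
    cache_valid: bool = False   # plays the role of the C_ACCCACHE flag
    cached_uid: int = -1
    cached_mode: int = 0        # bitmask of rights recently granted


def access(cp: Cnode, uid: int, mode: int,
           venus_access: Callable[[int, int], bool]) -> bool:
    """Answer from the cache when possible, else ask Venus and remember a grant."""
    if cp.cache_valid and cp.cached_uid == uid and (cp.cached_mode & mode) == mode:
        return True                      # positive cache hit, no upcall
    if not venus_access(uid, mode):      # upcall; denials are never cached
        return False
    if cp.cache_valid and cp.cached_uid == uid:
        cp.cached_mode |= mode           # assumption: accumulate rights per user
    else:
        cp.cache_valid, cp.cached_uid, cp.cached_mode = True, uid, mode
    return True


def flush(cnodes: Iterable[Cnode]) -> None:
    """Model of CODA_FLUSH: invalidate every entry."""
    for cp in cnodes:
        cp.cache_valid = False


def purge_user(cnodes: Iterable[Cnode], uid: int) -> None:
    """Model of CODA_PURGEUSER: walk the cnodes, drop only this user's entries."""
    for cp in cnodes:
        if cp.cache_valid and cp.cached_uid == uid:
            cp.cache_valid = False
```

Walking every cnode in `purge_user` costs more than bumping a generation number, but it leaves other users' cached decisions intact, which is the trade-off the message describes.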
|
176234 |
13-Feb-2008 |
rwatson |
Remove now-unused Coda namecache.
MFC after: 1 month
|
176233 |
13-Feb-2008 |
rwatson |
Rather than having the Coda module use its own namecache, use the global VFS namecache, as is done by the Coda module on Linux. Unlike the Coda namecache, the global VFS namecache isn't tagged by credential, so use more conservative flushing behavior (for now) when CODA_PURGEUSER is issued by Venus.
This improves overall integration with the FreeBSD VFS, including allowing __getcwd() to work better, procfs/procstat monitoring, and so on. This improves shell behavior in many cases, and improves ".." handling. It may lead to some slowdown until we've implemented a specific access cache, which should net improve performance, but in the mean time, lookup access control now always goes to Venus, whereas previously it didn't.
MFC after: 1 month
|
176232 |
13-Feb-2008 |
attilio |
Fix a lock leak in the ntfs locking scheme: When ntfs_ntput() reaches 0 in the refcount the inode lockmgr is not released and directly destroyed. Fix this by unlocking the lockmgr() even in the case of zero-refcount.
Reported by: dougb, yar, Scot Hetzel <swhetzel at gmail dot com> Submitted by: yar
|
176156 |
11-Feb-2008 |
rwatson |
Clean up coda_pathconf() slightly while debugging a problem there.
MFC after: 1 month
|
176139 |
10-Feb-2008 |
rwatson |
Since we're now actively maintaining the Coda module in the FreeBSD source tree, restyle everything but coda.h (which is more explicitly shared across systems) into a closer approximation to style(9).
Remove a few more unused function prototypes.
Add or clarify some comments.
MFC after: 1 month
|
176131 |
09-Feb-2008 |
rwatson |
Various further non-functional cleanups to coda:
- Rename print_vattr to coda_print_vattr and make static, rename print_cred to coda_print_cred.
- Remove unused coda_vop_nop.
- Add XXX comment because coda_readdir forwards to the cache vnode's readdir rather than venus_readdir, and annotate venus_readdir as unused.
- Rename vc_nb_* to vc_*.
- Use d_open_t, d_close_t, d_read_t, d_write_t, d_ioctl_t and d_poll_t for prototyping vc_* as that is the intent, don't use our own definitions.
- Rename coda_nb_statfs to coda_statfs, rename NB_SFS_SIZ to CODA_SFS_SIZ.
- Replace one more OBE reference to NetBSD with a reference to FreeBSD.
- Tidy up a little vertical whitespace here and there.
- Annotate coda_nc_zapvnode as unused.
- Remove unused vcodattach.
- Annotate VM_INTR as unused.
- Annotate that coda_fhtovp is unused and doesn't match the FreeBSD prototype, so isn't hooked up to vfs_fhtovp. If we want NFS export of Coda to work someday, this needs to be fixed.
- Remove unused getNewVnode.
- Remove unused coda_vget, coda_init, coda_quotactl prototypes.
MFC after: 1 month
|
176130 |
09-Feb-2008 |
rwatson |
No reason not to maintain stats on statfs in Coda, as it's done for other VFS operations, so uncomment the existing statistics gathering.
MFC after: 1 month
|
176129 |
09-Feb-2008 |
rwatson |
Remove unused devtomp(), which exploited UFS-specific knowledge to find the mountpoint for a specific device. This was implemented incorrectly, a bad idea in a fundamental sense, and also never used, so presumably a long-idle debugging function.
MFC after: 1 month
|
176127 |
09-Feb-2008 |
rwatson |
Since Coda is effectively a stacked file system, use VOP_EOPNOTSUPP for vop_bmap; delete the existing stub that returned either EINVAL or EOPNOTSUPP, and had unreachable calls to VOP_BMAP on the cache vnode.
MFC after: 1 month
|
176122 |
09-Feb-2008 |
rwatson |
Lock cache vnode when VOP_FSYNC() is called on a Coda vnode.
MFC after: 1 month
|
176121 |
09-Feb-2008 |
rwatson |
Make all calls to vn_lock() in Coda, including recently added ones, use LK_RETRY, since failure is undesirable (and not handled).
MFC after: 1 month Pointed out by: kib
|
176120 |
08-Feb-2008 |
rwatson |
The Coda module was originally ported to NetBSD from Mach by rvb, and then later to FreeBSD. Update various NetBSD-related comments: in some cases delete them because they don't apply, in others update to say FreeBSD as they still apply but in FreeBSD (and might for that matter no longer apply on NetBSD), and flag one case where I'm not sure whether it applies.
MFC after: 1 month
|
176118 |
08-Feb-2008 |
rwatson |
Before invoking vnode operations on cache vnodes, acquire the vnode locks of those vnodes. Probably, Coda should do the same lock sharing/ pass-through that is done for nullfs, but in the mean time this ensures that locks are adequately held to prevent corruption of data structures in the cache file system.
Assuming most operations came from the top layer of Coda and weren't performed directly on the cache vnodes, in practice this corruption was relatively unlikely as the Coda vnode locks were ensuring exclusive access for most consumers.
This causes WITNESS to squeal like a pig immediately when Coda is used, rather than waiting until file close; I noticed these problems because of the lack of said squealing.
MFC after: 1 month
|
176117 |
08-Feb-2008 |
rwatson |
Remove undefined code excluded by #if 1 #else, which previously protected vget() calls using inode numbers to query the root of /coda, which is not needed since we now cache the root vnode with the mountpoint.
MFC after: 1 month
|
176116 |
08-Feb-2008 |
attilio |
Convert all explicit instances of VOP_ISLOCKED(arg, NULL) into VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should only accept curthread as argument; this will lead to axing the additional argument from both functions, making the code cleaner.
Reviewed by: jeff, kib
|
175679 |
26-Jan-2008 |
rwatson |
Remove Giant acquisition around soreceive() and sosend() in fifofs. The bug that caused us to reintroduce it is believed to be fixed, and Kris says he no longer sees problems with fifofs in highly parallel builds. If this works out, we'll MFC it for 7.1.
MFC after: 3 months Pointed out by: kris
|
175635 |
24-Jan-2008 |
attilio |
Cleanup lockmgr interface and exported KPI:
- Remove the "thread" argument from the lockmgr() function as it is always curthread now
- Axe lockcount() function as it is no longer used
- Axe LOCKMGR_ASSERT() as it is really bogus and not currently used. Hopefully this will soon be replaced by something suitable.
- Remove the prototype for dumplockinfo() as the function is no longer present
Additionally:
- Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed
- Do a little bit of style(9) cleanup on lockmgr.h
The KPI is heavily broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits.
Tested by: matteo
|
175545 |
21-Jan-2008 |
rwatson |
Put "coda_rdwr: Internally Opening" printf generated by in-kernel writes to files, such as ktrace output, under CODA_VERBOSE. Otherwise, each such call to VOP_WRITE() results in a kernel printf.
MFC after: 3 days Obtained from: NetBSD
|
175544 |
21-Jan-2008 |
rwatson |
Replace calls to VOP_LOCK() w/o LK_RETRY with calls to vn_lock() with LK_RETRY, avoiding extra error handling, or in some cases, missing error handling.
MFC after: 3 days Discussed with: kib
|
175498 |
19-Jan-2008 |
rwatson |
Remove unused oldhash definition from Coda namecache.
MFC after: 3 days
|
175482 |
19-Jan-2008 |
rwatson |
Improve default vnode operation handling for Coda:
- Don't specify vnode operations for mknod, lease, and advlock--let them fall through to vop_default.
- Implement vop_default with &default_vnodeops, rather than with VOP_PANIC, so that unimplemented vnode operations are handled in more sensible ways than panicking, such as EOPNOTSUPP on ACL queries generated by bsdtar, or mknod.
MFC after: 3 days
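The fall-through behaviour described in this entry can be illustrated with a toy dispatch table in Python. This is a simplified sketch, not the kernel's lookup code: the dictionaries, `vop_call`, and the use of a `vop_bypass` catch-all are modeling assumptions chosen to show why an unimplemented operation ends in EOPNOTSUPP instead of a panic.

```python
import errno


def vop_eopnotsupp(*args):
    """Catch-all handler: report 'operation not supported'."""
    return errno.EOPNOTSUPP


# Toy model of the generic vector: anything not listed hits the catch-all.
default_vnodeops = {"vop_bypass": vop_eopnotsupp, "vop_default": None}

# Toy Coda vector: implements a few operations and defers the rest.
coda_vnodeops = {
    "vop_getattr": lambda *args: 0,
    "vop_default": default_vnodeops,
}


def vop_call(vector, name, *args):
    """Walk the vop_default chain until some vector can handle `name`."""
    v = vector
    while v is not None:
        if name in v:
            return v[name](*args)
        if "vop_bypass" in v:
            return v["vop_bypass"](*args)
        v = v.get("vop_default")
    raise RuntimeError(f"no handler for {name}")   # stands in for a panic
```

With this layout, `vop_call(coda_vnodeops, "vop_mknod")` returns `errno.EOPNOTSUPP`, while a vector with no default at all raises, which mirrors the old VOP_PANIC behaviour.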
|
175481 |
19-Jan-2008 |
rwatson |
Rework coda_statfs(): no longer need to zero the statfs structure or fill out all fields, just fill out the ones the file system knows about. Among other things, this causes the output of "mount" and "df" to make quite a bit more sense as /dev/cfs0 is specified as the mountfrom name.
MFC after: 3 days
|
175479 |
19-Jan-2008 |
rwatson |
Zero mi_rootvp and coda_ctlvp immediately after calling vrele() on the vnodes during coda_unmount() in order to detect errant use of them after the vnode references may no longer be valid.
No need to clear the VV_ROOT flag on mi_rootvp (especially after the vnode reference is no longer valid) as this isn't done on other file systems.
MFC after: 3 days
|
175478 |
19-Jan-2008 |
rwatson |
Don't acquire an additional vnode reference to a vnode when it is opened and then release it when it is closed: we rely on the caller to keep the vnode around with a valid reference. This avoids vrele() destroying the vnode vop_close() is being called from during a call to vop_close(), and a crash due to lockmgr recursing the vnode lock when a Coda unmount occurs.
MFC after: 3 days
|
175476 |
19-Jan-2008 |
rwatson |
Don't declare functions as extern.
Move all extern variable definitions to associated .h files, move some extern variable definitions between include files to place them more appropriately.
MFC after: 3 days
|
175475 |
19-Jan-2008 |
rwatson |
Use VOP_NULL rather than VOP_PANIC for Coda's vop_print routine, so as to avoid panicking in DDB show lockedvnods.
MFC after: 3 days
|
175474 |
19-Jan-2008 |
rwatson |
Lock the new directory vnode returned by coda_mkdir(), as this is required by FreeBSD's vnode locking protocol.
MFC after: 3 days
|
175473 |
19-Jan-2008 |
rwatson |
Borrow the VM object associated with an underlying cache vnode with the Coda vnode derived from it, in the style of nullfs. This allows files in the Coda file system to be memory-mapped, such as with execve(2) or mmap(2).
MFC after: 3 days Reported by: Rune <u+openafsdev-sr55 at chalmers dot se>
|
175436 |
18-Jan-2008 |
kib |
udf_vget() shall vgone() the vnode when the file_entry cannot be allocated or read from the volume. Otherwise, half-constructed vnode could be found later and cause panic when accessed.
PR: 118322 MFC after: 1 week
|
175294 |
13-Jan-2008 |
attilio |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with a 'thread' argument that is always curthread. Remove the useless extra argument and pass curthread explicitly to lower layer functions, when necessary.
The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed later.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
175202 |
10-Jan-2008 |
attilio |
vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and in particular removes an annoying dependency, helping the next lockmgr() cleanup. The KPI is, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, it is worth saying that the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
175166 |
08-Jan-2008 |
attilio |
Remove explicit calling of lockmgr() with the NULL argument. Now, the lockmgr() function can only be called passing curthread and the KASSERT() is upgraded accordingly.
In order to support on-the-fly owner switching, the new function lockmgr_disown() has been introduced and is used in BUF_KERNPROC(). The KPI is therefore changed and the FreeBSD version will be bumped soon. Differently from the previous code, we assume the idle thread cannot try to acquire the lockmgr as it cannot sleep, so drop the related check [1] in BUF_KERNPROC().
Tested by: kris
[1] kib asked for a KASSERT in lockmgr_disown() about this condition, but after thinking about it, as this is a well known general rule, I found it not really necessary.
|
175151 |
08-Jan-2008 |
jhb |
Lock the vnode interlock while reading v_usecount to update si_usecount in a cdev in devfs_reclaim().
MFC after: 3 days Reviewed by: jeff (a while ago)
|
175140 |
07-Jan-2008 |
jhb |
Make ftruncate a 'struct file' operation rather than a vnode operation. This makes it possible to support ftruncate() on non-vnode file types in the future.
- 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on a given file descriptor.
- ftruncate() moves to kern/sys_generic.c and now just fetches a file object and invokes fo_truncate().
- The vnode-specific portions of ftruncate() move to vn_truncate() in vfs_vnops.c which implements fo_truncate() for vnode file types.
- Non-vnode file types return EINVAL in their fo_truncate() method.
Submitted by: rwatson
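The dispatch pattern in this entry can be sketched as a userland Python model. The names `fo_truncate`, `vn_truncate`, and `ftruncate` come from the log message; `nonvnode_truncate`, the dictionary-based file objects, and the descriptor table are hypothetical illustrations, not kernel structures.

```python
import errno


def vn_truncate(fp, length):
    """vnode-backed files: actually change the size (toy model)."""
    if length < 0:
        return errno.EINVAL
    # Shrink, or grow with zero bytes, to exactly `length` bytes.
    fp["data"] = fp["data"][:length].ljust(length, b"\0")
    return 0


def nonvnode_truncate(fp, length):
    """Hypothetical handler for sockets, pipes, etc.: always EINVAL."""
    return errno.EINVAL


vnops = {"fo_truncate": vn_truncate}
socketops = {"fo_truncate": nonvnode_truncate}


def ftruncate(fdtable, fd, length):
    """Fetch the file object, then dispatch through its ops vector."""
    fp = fdtable.get(fd)
    if fp is None:
        return errno.EBADF
    return fp["ops"]["fo_truncate"](fp, length)
```

Because `ftruncate` only looks up and calls the method, supporting a new file type later means supplying a different `fo_truncate`, with no change to the generic entry point.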
|
175137 |
07-Jan-2008 |
attilio |
g_vfs_close() wants the sx topology lock held while executing, so just add correct locking to the operation of unmounting. This will prevent debugging kernels from panicking if mounting a non-hpfs partition (I'm not sure if this can be a problem with a successful mounting operation though).
MFC: 3 days
|
174988 |
30-Dec-2007 |
jeff |
Remove explicit locking of struct file.
- Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid.
- Protect f_flag and f_count with atomic operations.
- Remove the global list of all files and associated accounting.
- Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets.
- Mark sockets in the accept queue so we don't incorrectly gc them.
Tested by: kris, pho
|
174951 |
28-Dec-2007 |
attilio |
Trim out the now unused option LK_EXCLUPGRADE from the lockmgr namespace. This option just adds complexity and the new implementation will no longer support it, so axing it now that it is unused is probably the best idea.
FreeBSD version is bumped in order to reflect the KPI breakage introduced by this patch.
In the ports tree, kris found that only old OSKit code uses it, but as it is thought to work only on the 2.x kernel series, version bumping will solve any problem.
|
174898 |
25-Dec-2007 |
rwatson |
Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run.
Assign approximate why values to all current consumers of the kdb_enter() interface.
|
174538 |
11-Dec-2007 |
markus |
Fix calculation of descriptor tag checksums. According to ECMA-167, Part 4, 7.2.3, bytes 0-3 and 5-15 are used to calculate the checksum of a descriptor tag.
PR: kern/90521 Submitted by: Björn König <bkoenig@cs.tu-berlin.de> Reviewed by: scottl Approved by: emax (mentor)
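The checksum rule cited in this entry can be written out as a short Python sketch. The function names are hypothetical; the byte ranges come from the log message, and the sum is taken modulo 256 as ECMA-167 specifies for the tag checksum field.

```python
def descriptor_tag_checksum(tag: bytes) -> int:
    """Sum, modulo 256, of bytes 0-3 and 5-15 of a 16-byte descriptor tag."""
    if len(tag) < 16:
        raise ValueError("a descriptor tag is 16 bytes long")
    # Byte 4 holds the checksum itself, so it is skipped.
    return (sum(tag[0:4]) + sum(tag[5:16])) % 256


def tag_checksum_ok(tag: bytes) -> bool:
    """Compare the stored checksum (byte 4) against the computed one."""
    return tag[4] == descriptor_tag_checksum(tag)
```

Skipping byte 4 is what makes verification possible: the stored value never feeds into its own computation.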
|
174384 |
07-Dec-2007 |
delphij |
Turn MPASS(0) into a panic with a more obvious reason why the assertion failed.
|
174379 |
06-Dec-2007 |
delphij |
size_max should be unsigned, as such, use size_t here.
|
174265 |
04-Dec-2007 |
wkoszek |
Explicitly initialize 'error' to 0 (two places). This lets one build tmpfs from the latest source tree with an older compiler (gcc3).
Reviewed by: kib@ (on freebsd-current@) Approved by: cognet@ (mentor)
|
173728 |
18-Nov-2007 |
maxim |
o English lesson from bde@: "iff" is not a typo, it means "if and only if". Backout previous.
|
173725 |
18-Nov-2007 |
delphij |
MFp4: Several fixes to tmpfs which make it survive pho@'s stress2 suite. To quote his letter, this change:
1. It removes the tn_lookup_dirent stuff. I think this cannot be fixed, because nothing protects vnode/tmpfs node between lookup is done, and actual operation is performed, in the case the vnode lock is dropped. At least, this is the case with the from vnode for rename.
For now, we do the linear lookup in the parent node. This has its own drawbacks. Not mentioning speed (that could be fixed by using hash), the real problem is the situation where several hardlinks exist in the dvp. But, I think this is fixable.
2. The patch restores the VV_ROOT flag on the root vnode after it became reclaimed and allocated again. This fixes MPASS assertion at the start of the tmpfs_lookup() reported by many.
Submitted by: kib
|
173724 |
18-Nov-2007 |
delphij |
MFp4: Fix several style(9) bugs.
Submitted by: des
|
173695 |
17-Nov-2007 |
maxim |
o Mask maximum file permissions we get from mount_ntfs -m with ACCESSPERMS. Document in mount_ntfs(8) that only the nine low-order bits of mask are used (taken from mount_msdosfs(8)).
PR: kern/114856 Submitted by: Ighighi MFC after: 1 month
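The masking step in this entry amounts to a single bitwise AND. A minimal Python illustration follows; `clamp_mask` is a hypothetical name, and ACCESSPERMS is rebuilt here from the standard permission constants (it is 0777 in `<sys/stat.h>`).

```python
import stat

# ACCESSPERMS is S_IRWXU | S_IRWXG | S_IRWXO, i.e. octal 0777.
ACCESSPERMS = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO


def clamp_mask(mask: int) -> int:
    """Keep only the nine low-order permission bits of a -m argument."""
    return mask & ACCESSPERMS
```

For example, a mask of `0o4755` is reduced to `0o755`, so set-id and file-type bits supplied by the user cannot leak into the reported mode.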
|
173690 |
17-Nov-2007 |
maxim |
o Fix a typo in the comment.
|
173590 |
13-Nov-2007 |
maxim |
o Do not leak inodes hash table at module unload.
PR: kern/118017 Submitted by: Ighighi MFC after: 1 week
|
173570 |
12-Nov-2007 |
delphij |
Correct a stack overflow which will trigger panics when mode= is specified, caused by incorrect format string specified to vfs_scanopt() and subsequently vsscanf().
Pointed out by: kib Submitted by: des
|
172954 |
25-Oct-2007 |
trhodes |
Remove some debugging code that, while useful, doesn't belong in the committed version. While here, expand a macro only used once.
Discussed with/oked by: bde
|
172930 |
24-Oct-2007 |
rwatson |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
mac_<object>_<method/action> mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names.
All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
|
172883 |
22-Oct-2007 |
delphij |
Fixes to msdosfs dirtyflag related stuff:
- markvoldirty() needs to write to the underlying GEOM provider. We have to do that *before* g_access() which sets the GEOM provider to read-only.
- Remove the dirty flag before free'ing iconv related resources. The dirty flag removal could fail, and it is hard to revert the iconv-free after the failure.
- Mark the volume as dirty if we have failed to mark it clean, for safety.
- Other style fixes to the touched functions.
|
172798 |
19-Oct-2007 |
bde |
Implement the async (really, delayed-write) mount option for msdosfs.
This is much simpler than for ffs since there are many fewer places where we need to choose between a delayed write and a sync write -- just 5 in msdosfs and more than 30 in ffs.
This is more complete and correct than in ffs. Several places in ffs are still missing the choice. ffs_update() has a layering violation that breaks callers which want to force a sync update (mainly fsync(2) and O_SYNC write(2)).
However, fsync(2) and O_SYNC write(2) are still more broken than in ffs, since they are broken for default (non-sync non-async) mounts too. Both fail to sync the FAT in all cases, and both fail to sync the directory entry in some cases after losing a race. Async everything is probably safer than the half-baked sync of metadata given by default mounts.
|
172758 |
18-Oct-2007 |
bde |
Add noclusterr and noclusterw options to the options list. I forgot these when I implemented clustering.
|
172757 |
18-Oct-2007 |
bde |
Fix some style bugs in the mount options list. Mainly, sort the list, leaving space for adding missing options. Negative options are sorted after removing their "no" prefix, and generic options are sorted before msdosfs-specific ones.
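The ordering rule in this entry can be expressed as a sort key. The Python sketch below is illustrative only: `option_sort_key` is a hypothetical helper, the option names in the usage note are examples, and it assumes every name starting with "no" is a negative option.

```python
def option_sort_key(name: str, generic: set) -> tuple:
    """Generic options first; within each group, order by name minus a 'no' prefix."""
    # Assumption: any leading "no" marks a negative option.
    base = name[2:] if name.startswith("no") else name
    return (name not in generic, base)


def sort_options(options: list, generic: set) -> list:
    """Return the options in the order the commit message describes."""
    return sorted(options, key=lambda name: option_sort_key(name, generic))
```

With `generic = {"async", "noatime", "noclusterr", "noclusterw"}`, sorting `["noclusterw", "async", "noatime", "shortnames", "noclusterr", "kiconv"]` places the generic options first in the order async, noatime, noclusterr, noclusterw, followed by kiconv and shortnames.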
|
172741 |
18-Oct-2007 |
bde |
In msdosfs_setattr(), don't do synchronous updates of the denode (except indirectly for the size pseudo-attribute). If anything deserves a sync update, then it is ids and immutable flags, since these are related to security, but ffs never synced these and msdosfs doesn't support them. (ufs_setattr() only does an update in one case where it is least needed (for timestamps); it did pessimal sync updates for timestamps until 1998/03/08 but was changed for unlogged reasons related to soft updates.)
Now msdosfs calls deupdat() with waitfor == 0, which normally gives a delayed update to disk but always gives a sync update of timestamps in core, while for ffs everything is delayed until the syncer daemon or other activity causes an update (except for timestamps).
This gives a large optimization mainly for things like cp -p, where attribute adjustment could easily triple the number of physical I/O's if it is done synchronously (but cp -p to msdosfs is not as bad as that, since msdosfs doesn't support many attributes so null adjustments are more common, and msdosfs doesn't support ctimes so even if cp doesn't weed out null adjustments they don't become non-null after clobbering the ctime).
|
172697 |
16-Oct-2007 |
alfred |
Get rid of qaddr_t.
Requested by: bde
|
172644 |
14-Oct-2007 |
daichi |
These changes make nullfs work correctly with the latest unionfs.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172643 |
14-Oct-2007 |
daichi |
Added whiteout behavior option. ``-o whiteout=always'' is default mode (it is established practice) and ``-o whiteout=whenneeded'' is less disk-space using mode especially for resource restricted environments like embedded environments. (Contributed by Ed Schouten. Thanks)
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172642 |
14-Oct-2007 |
daichi |
Default copy mode has been changed from traditional-mode to transparent-mode. Some folks who reported issues have solved them with transparent mode. We guess it is time to change the default copy mode. The transparent-mode is the best in most situations.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172641 |
14-Oct-2007 |
daichi |
Fixed un-vrele issue of upper layer root vnode of unionfs.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172640 |
14-Oct-2007 |
daichi |
Added NULL check code pointed out by Coverity. (via Stanislav Sedov. Thanks)
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172639 |
14-Oct-2007 |
daichi |
- It has become MPSAFE.
- Fixed lock panic issue under MPSAFE.
- Fixed panic issue whenever it locks vnode with reclaim.
- Fixed lock implementations not conforming to vnode_if.src style.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172638 |
14-Oct-2007 |
daichi |
Fixed missing vnode unlock/vrele handling when errors occur during some operations.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172637 |
14-Oct-2007 |
daichi |
- Added support for vfs_cache on unionfs. As a result, you can use applications that use procfs on unionfs.
- Removed unionfs internal cache mechanism because it has vfs_cache support instead. As a result, it just simplified code of unionfs.
- Fixed kern/111262 issue.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172636 |
14-Oct-2007 |
daichi |
Added handling to prevent an infinite readdir loop when used with the Linux binary compatibility feature.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172635 |
14-Oct-2007 |
daichi |
Changed it to free unneeded memory ASAP.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172634 |
14-Oct-2007 |
daichi |
Improved access permission checks.
Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
|
172453 |
05-Oct-2007 |
jhb |
Use the correct pid when checking to see whether or not the /proc/<pid> directory itself (rather than any of its contents) is visible to the current thread.
MFC after: 1 week PR: kern/90063 Submitted by: john of 8192.net Approved by: re (kensmith)
|
172442 |
04-Oct-2007 |
delphij |
MFp4: Provide a dummy verb "export" to shut up the message shown at startup when NFS is enabled.
Reported by: rafan Approved by: re (tmpfs blanket)
|
172441 |
04-Oct-2007 |
delphij |
Additional work is still needed before we can claim that tmpfs is stable enough for production usage. Warn user upon mount.
Approved by: re (tmpfs blanket)
|
172303 |
23-Sep-2007 |
bde |
Remove some of the pessimizations involving writing the fsi sector. All active fields in fsi are advisory/optional, so we shouldn't do extra work to make them valid at all times, but instead we write to the fsi too often (we still do), and we searched for a free cluster for fsinxtfree too often.
This commit just removes the whole search and its results, so that we write out our in-core copy of fsinxtfree instead of writing a "fixed" copy and clobbering our in-core copy. This saves fixing 3 bugs:
- off-by-1 error for the end of the search, resulting in fsinxtfree not actually being adjusted iff only the last cluster is free.
- missing adjustment when no clusters are free.
- off-by-many error for the start of the search. Starting the search at 0 instead of at (the in-core copy of) fsinxtfree did more than defeat the reasons for existence of fsinxtfree. fsinxtfree exists mainly to avoid having to start at 0 for just the first search per mount, but has the side effect of reducing bias towards allocating near cluster 0. The bias would normally only be generated by the first search per mount (if fsinxtfree is not supported), but since we also adjusted the in-core copy of fsinxtfree here, we were doing extra work to maximize the bias.
Approved by: re (kensmith)
|
172292 |
21-Sep-2007 |
rodrigc |
Disable multiple ntfs mounts to the same mountpoint. Eliminates panics due to locking issues. Idea taken from src/sys/gnu/fs/xfs/FreeBSD/xfs_super.c.
PR: 89966, 92000, 104393 Reported by: H. Matsuo <hiroshi50000 yahoo co jp>, Chris <m2chrischou gmail.com>, Andrey V. Elsukov <bu7cher yandex ru>, Jan Henrik Sylvester <me janh de> Approved by: re (kensmith)
|
172207 |
17-Sep-2007 |
jeff |
- Move all of the PS_ flags into either p_flag or td_flags.
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time.
- Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags.
- Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM.
Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)
|
172027 |
31-Aug-2007 |
bde |
Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions can easily block in bread(), and then there was nothing to prevent the static buffer (nambuf_{ptr,len,last_id}) being clobbered by another thread.
The effects of the bug seem to have been limited to failed lookups and mangled names in readdir(), since Giant locking provides enough serialization to prevent concurrent calls to the functions that access the buffer. They were very obvious for multiple concurrent tree walks, especially with a small cluster size.
The bug was introduced in msdosfs_conv.c 1.34 and associated changes, and is in all releases starting with 5.2.
The fix is to allocate the buffer as a local variable and pass around pointers to it like "_r" functions in libc do. Stack use from this is large but not too large. This also fixes a memory leak on module unload.
Reviewed by: kib Approved by: re (kensmith)
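The "_r"-style fix described above can be sketched in user space. This is only an illustration of the pattern, not the msdosfs code: struct name_buf and name_append_r() are hypothetical names, and the 64-byte size is an arbitrary assumption.

```c
#include <string.h>

/* Hypothetical per-call state that used to live in static variables. */
struct name_buf {
	char	data[64];	/* assumed size, for illustration only */
	size_t	len;
};

/*
 * Re-entrant style: all state lives in the caller's buffer, so two
 * callers that block and resume cannot clobber each other's data.
 */
void
name_append_r(struct name_buf *nb, const char *part)
{
	size_t n = strlen(part);

	if (nb->len + n >= sizeof(nb->data))
		return;		/* sketch only: drop parts that do not fit */
	memcpy(nb->data + nb->len, part, n + 1);
	nb->len += n;
}
```

Each caller declares its own struct name_buf on the stack and passes a pointer down, which is where the extra stack use mentioned above comes from.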
|
171862 |
16-Aug-2007 |
delphij |
MFp4: rework tmpfs_readdir() logic in terms of correctness.
Approved by: re (tmpfs blanket) Tested with: fstest, fsx
|
171852 |
15-Aug-2007 |
jhb |
On 6.x this works:
% mount | grep home
/dev/ad4s1e on /home (ufs, local, noatime, soft-updates)
% mount -u -o atime /home
% mount | grep home
/dev/ad4s1e on /home (ufs, local, soft-updates)
Restore this behavior on 7.x for the following mount options: noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow
In addition, on 7.x, the following are equivalent:
mount -u -o atime /home
mount -u -o nonoatime /home
Ideally, when we introduce new mount options, we should avoid options starting with "no". :)
Requested by: jhb Reported by: Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com> Approved by: re (bmah) Proxy commit for: rodrigc
|
171802 |
10-Aug-2007 |
delphij |
MFp4: - LK_RETRY prohibits vget() and vn_lock() from returning an error. Remove associated code. [1] - Properly use vhold() and vdrop() instead of their unlocked versions; we are guaranteed to have the vnode's interlock unheld. [1] - Fix a pseudo-infinite loop caused by 64/32-bit arithmetic, in the same way as in modern NetBSD versions. [2] - Reorganize tmpfs_readdir to reduce duplicated code.
Submitted by: kib [1] Obtained from: NetBSD [2] Approved by: re (tmpfs blanket)
|
171799 |
10-Aug-2007 |
delphij |
MFp4:
- Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1] - Properly lock around tn_vnode to avoid NULL dereference - Be more careful handling vnodes (*)
(*) This is a WIP [1] by pjd via howardsu
Thanks kib@ for his valuable VFS related comments.
Tested with: fsx, fstest, tmpfs regression test set Found by: pho's stress2 suite Approved by: re (tmpfs blanket)
|
171774 |
07-Aug-2007 |
bde |
In msdosfs_read() and msdosfs_write(), don't check explicitly for (uio_offset < 0) since this can't happen. If this happens, then the general code handles the problem safely (better than before for reading, returning 0 (EOF) instead of the bogus errno EINVAL, and the same as before for writing, returning EFBIG).
In msdosfs_read(), don't check for (uio_resid < 0). msdosfs_write() already didn't check.
In msdosfs_read(), document in a comment our assumptions that the caller passed a valid uio_offset and uio_resid. ffs checks using KASSERT(), and that is enough sanity checking. In the same comment, partly document there is no need to check for the EOVERFLOW case, unlike in ffs where this case can happen at least in theory.
In msdosfs_write(), add a comment about why the checking of (uio_resid == 0) is explicit, unlike in ffs.
In msdosfs_write(), check for impossibly large final offsets before checking if the file size rlimit would be exceeded, so that we don't have an overflow bug in the rlimit check and are consistent with ffs. We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final offset would be impossibly large but not so large as to cause overflow. Overflow normally gave the benign behaviour of no signal.
Approved by: re (kensmith) (blanket)
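The check ordering described above can be sketched as follows. This is a user-space illustration under assumptions, not the msdosfs_write() code: the enum, the function name and the parameters are hypothetical, and offset and resid are assumed to be non-negative, as the commit message says the caller guarantees.

```c
#include <stdint.h>

enum write_check { WRITE_OK, WRITE_EFBIG, WRITE_EFBIG_SIGXFSZ };

/*
 * Hypothetical check order: reject impossibly large final offsets
 * first, so that the addition in the rlimit comparison cannot overflow.
 */
enum write_check
check_write(int64_t offset, int64_t resid, int64_t max_size, int64_t rlimit)
{
	/* Overflow-free form of "offset + resid > max_size". */
	if (offset > max_size || resid > max_size - offset)
		return (WRITE_EFBIG);
	/* Safe now: offset + resid <= max_size, so the sum cannot wrap. */
	if (offset + resid > rlimit)
		return (WRITE_EFBIG_SIGXFSZ);
	return (WRITE_OK);
}
```

With the checks in the opposite order, a huge offset could make offset + resid wrap to a negative value and slip past the rlimit comparison, which matches the "no signal" overflow behaviour mentioned above.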
|
171771 |
07-Aug-2007 |
bde |
Fix and update the comments about the effect of the read-only flag on writing. They are still too verbose.
Remove nearby unreachable code for handling symlinks.
Approved by: re (kensmith) (blanket)
|
171759 |
07-Aug-2007 |
bde |
Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix some whitespace errors; fix only one case of a boolean comparison of a non-boolean).
Improve an error message by quoting ".", and by not printing large positive values as negative ones.
Approved by: re (kensmith) (blanket)
|
171758 |
07-Aug-2007 |
bde |
Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix only a couple of whitespace errors).
Approved by: re (kensmith) (blanket)
|
171757 |
07-Aug-2007 |
bde |
Fix some style bugs (mainly some whitespace errors).
Approved by: re (kensmith) (blanket)
|
171756 |
07-Aug-2007 |
bde |
Fix some style bugs (some whitespace errors only).
Approved by: re (kensmith) (blanket)
|
171755 |
07-Aug-2007 |
bde |
Sort includes.
Remove rotted banal comment attached to includes.
Approved by: re (kensmith) (blanket)
|
171754 |
07-Aug-2007 |
bde |
Sort includes.
Remove banal comments attached to includes.
Approved by: re (kensmith) (blanket)
|
171752 |
07-Aug-2007 |
bde |
Sort includes.
Remove banal comments before includes. Remove rotted banal comments attached to includes.
Approved by: re (kensmith) (blanket)
|
171751 |
07-Aug-2007 |
bde |
Remove unused include(s).
Remove banal comments before includes.
Approved by: re (kensmith) (blanket)
|
171750 |
07-Aug-2007 |
bde |
Remove unused include(s).
Approved by: re (kensmith) (blanket)
|
171749 |
07-Aug-2007 |
bde |
Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h>
Approved by: re (kensmith) (blanket)
|
171748 |
07-Aug-2007 |
bde |
Include <sys/mutex.h>'s prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/vnode.h>.
Sort the include of <sys/mutex.h> instead of unsorting it after <sys/vnode.h> and depending on the pollution there.
Approved by: re (kensmith) (blanket)
|
171747 |
07-Aug-2007 |
bde |
Remove unused include(s).
Approved by: re (kensmith) (blanket)
|
171731 |
05-Aug-2007 |
bde |
Silently fix up the estimated next free cluster number from the fsinfo sector, instead of failing the whole mount if it is garbage. Fields in the fsinfo sector are only advisory, so there are better sanity checks than this, and we already silently fix up the only other advisory field in the fsinfo (the free cluster count).
This wasn't handled quite right in rev.1.92, 1.117, or in NetBSD. 1.92 also failed the whole mount for the non-garbage magic value 0xffffffff. 1.117 fixed this well enough in practice since garbage values shouldn't occur in practice, but left the error handling larger and more convoluted than necessary. Now we handle the magic value as a special case of fixing up all out of bounds values.
Also fix up the estimated next free cluster number when there is no fsinfo sector. We were using 0, but CLUST_FIRST is safer.
Approved by: re (kensmith)
|
171711 |
03-Aug-2007 |
bde |
Oops, fix the fix for the i/o size of the fsinfo block. Its log message explained why the size is 1 sector, but the code used a size of 1 cluster.
I/o sizes larger than necessary may cause serious coherency problems in the buffer cache. Here I think there were only minor efficiency problems, since a too-large fsinfo buffer could only get far enough to overlap buffers for the same vnode (the device vnode), so mappings are coherent at the page level although not at the buffer level, and the former is probably enough due to our limited use of the fsinfo buffer.
Approved by: re (kensmith)
|
171704 |
03-Aug-2007 |
delphij |
MFp4 - Refine locking to eliminate some potential race/panics:
- Copy before testing a pointer. This closes a race window. - Use msleep with the node interlock instead of tsleep. - Do proper locking around access to tn_vpstate. - Assert vnode VOP lock for dir_{atta,de}tach to capture inconsistent locking.
Suggested by: kib Submitted by: delphij Reviewed by: Howard Su Approved by: re (tmpfs blanket)
|
171599 |
26-Jul-2007 |
pjd |
When we do open, we should lock the vnode exclusively. This fixes a few races: - fifo race, where two threads assign v_fifoinfo, - v_writecount modifications, - v_object modifications, - and probably more...
Discussed with: kib, ups Approved by: re (rwatson)
|
171570 |
24-Jul-2007 |
delphij |
MFp4: Force 64-bit arithmetic when calculating the maximum file size. This fixes tmpfs calculations on 32-bit systems equipped with more than 4GB swap.
Reported by: Craig Boston <craig xfoil gank org> PR: kern/114870 Approved by: re (tmpfs blanket)
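A minimal user-space sketch of the kind of overflow involved, assuming a 4096-byte page and hypothetical function names (this is not the tmpfs code itself):

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration */

/* Narrow form: the product is computed in 32 bits and wraps. */
uint32_t
max_size_narrow(uint32_t pages)
{
	return (pages * SKETCH_PAGE_SIZE);
}

/* Wide form: widen one operand first so the product is 64-bit. */
uint64_t
max_size_wide(uint32_t pages)
{
	return ((uint64_t)pages * SKETCH_PAGE_SIZE);
}
```

For 2097152 pages (8GB) the narrow form wraps to 0, while the wide form yields 8589934592.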
|
171551 |
23-Jul-2007 |
bde |
Make using msdosfs as the root file system sort of work:
o Initialize ownerships and permissions. They were garbage (0) for root mounts since vfs_mountroot_try() doesn't ask for them to be set and msdosfs's old incomplete code to set them was removed. The garbage happened to give the correct ownerships root:wheel, but it gave permissions 000 so init could not be execed. Use the macros for root:wheel and 0755. (The removed code gave 0:0 and 0777. 0755 is more normal and secure, though wrong for /tmp.)
o Check the readonly flag for initial (non-MNT_UPDATE) mounts in the correct place, as in ffs. For root mounts, it is only passed in mp->mnt_flags, since vfs_mountroot_try() only passes it as a flag and nothing translates the flag to the "ro" option string. msdosfs only looked for it in the string, so it gave a rw mount for root mounts without even clearing the flag in mp->mnt_flags, so the final state was inconsistent. Checking the flag only in mp->mnt_flags works for initial userland mounts too. The MNT_UPDATE case is messier.
The main point that should work but doesn't is fsck of msdosfs root while it is mounted ro. This needs mainly MNT_RELOAD support to work. It should be possible to run fsck -p and succeed provided the fs is consistent, not just for msdosfs, but this fails because fsck -p always tries to open the device rw. The hack that allows open for writing in ffs is not implemented in msdosfs, since without MNT_RELOAD support writing could only be harmful. So fsck must be turned off to use msdosfs as root. This is quite dangerous, since msdosfs is still missing actually using its fs-dirty flag internally, so it is happy to mount dirty filesystems rw.
Unrelated changes: - Fix missing error handling for MNT_UPDATE from rw to ro. - Catch up with renaming msdos to msdosfs in a string.
Approved by: re (kensmith)
|
171550 |
23-Jul-2007 |
delphij |
MFp4: When swapping is not enabled, allow creating files by taking physical memory pages into account for tm_maxfilesize.
Reported by: Dominique Goncalves <dominique.goncalves gmail.com> Submitted by: Howard Su Approved by: re (tmpfs blanket)
|
171523 |
20-Jul-2007 |
bde |
Implement vfs clustering for msdosfs.
This gives a very large speedup for small block sizes (in my tests, about 5 times for write and 3 times for read with a block size of 512, if clustering is possible) and a moderate speedup for the moderately large block sizes that should be used on non-small media (4K is the best size in most cases, and the speedup for that is about 1.3 times for write and 1.2 times for read). mmap() should benefit from clustering like read()/write(), but the current implementation of vm only supports clustering (at least for getpages) if the fs block size is >= PAGE_SIZE.
msdosfs is now only slightly slower than ffs with soft updates for writing and slightly faster for reading when both use their best block sizes. Writing is slower for msdosfs because of more sync writes. Reading is faster for msdosfs because indirect blocks interfere with clustering in ffs.
The changes in msdosfs_read() and msdosfs_write() are simpler merges of corresponding code in ffs (after fixing some style bugs in ffs). msdosfs_bmap() needs fs-specific code. This implementation loops calling a lower level bmap function to do the hard parts. This is a bit inefficient, but is efficient enough since msdosfs_bmap() is only called when there is physical i/o to do.
Approved by: re (hrs)
|
171522 |
20-Jul-2007 |
bde |
Clean up before implementing vfs clustering for msdosfs:
In msdosfs_read(), mainly reorder the main loop to the same order as in ffs_read().
In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of clrbuf(). I think this is just a bogus optimization, but ffs always does it and msdosfs already did it in one place, and it is what I've tested.
In msdosfs_write(), merge good bits from a comment in ffs_write(), and fix 1 style bug.
In the main comment for msdosfs_pcbmap(), improve wording and catch up with 13 years of changes in the function. This comment belongs in VOP_BMAP.9 but that doesn't exist.
In msdosfs_bmap(), return EFBIG if the requested cluster number is out of bounds instead of blindly truncating it, and fix many style bugs.
Approved by: re (hrs)
|
171518 |
20-Jul-2007 |
rwatson |
Make sure we release the control vnode in Coda:
We allocate coda_ctlvp when /coda is mounted, but never release it. During the unmount this vnode was marked as UNMOUNTING and when venus is started a second time the system would hang, possibly waiting for the old vnode to disappear.
So now we call vrele on the control vnode when file system is unmounted to drop the reference we got during the mount. I'm pretty sure it is also necessary to not skip the handling in coda_inactive for the control vnode, it seems like that is the place we actually get rid of the vnode once the refcount has dropped to 0.
Submitted by: Jan Harkes <jaharkes at cs dot cmu dot edu> Approved by: re (kensmith)
|
171489 |
19-Jul-2007 |
delphij |
MFp4: Rework on tmpfs's mapped read/write procedures. This should finally fix fsx test case.
The printf's added here would be eventually turned into assertions.
Submitted by: Mingyan Guo (mostly) Approved by: re (tmpfs blanket)
|
171416 |
12-Jul-2007 |
rwatson |
Complete repo-copy and move of Coda from src/sys/coda to src/sys/fs/coda by removing files from src/sys/coda, and updating include paths in the new location, kernel configuration, and Makefiles. In one case add $FreeBSD$.
Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon
|
171414 |
12-Jul-2007 |
rwatson |
Forced commit to recognize repo-copy of Coda files from src/sys/coda to src/sys/fs/coda.
Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon
|
171408 |
12-Jul-2007 |
bde |
Round up the FAT block size to a multiple of the sector size so that i/o to the FAT is possible.
Make the FAT block size less arbitrary before it is rounded up: - for FAT12, default to 3*512 instead of to 3 sectors. The magic 3 is the default number of 512-byte FAT sectors on a floppy drive. That many sectors is too many if the sector size is larger. - for !FAT12, default to PAGE_SIZE instead of to 4096. Remove MSDOSFS_DFLTBSIZE since it only obfuscated this 4096.
For reading the BPB, use a block size of 8192 instead of 2048 so that sector sizes up to 8192 can work. We should try several sizes, or just try the maximum supported size (MAXBSIZE = 64K). I use 8192 because that is enough for DVD-RW's (even 2048 is enough) and 8192 has been tested a lot in use by ffs.
This completes fixing msdosfs for some large sector sizes (up to 8K for read and 64K for write). Microsoft documents support for sector sizes up to 4K in msdosfs. ffs is currently limited to 8K for both read and write.
Approved by: re (kensmith) Approved by: nyan (several years ago)
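The defaulting and rounding described above can be sketched as follows. The helper names are hypothetical and the page size is passed in as a parameter; round_up() uses the same arithmetic as the kernel's roundup() macro.

```c
#include <stdint.h>

/* Round x up to the next multiple of y (same idea as roundup()). */
uint32_t
round_up(uint32_t x, uint32_t y)
{
	return (((x + y - 1) / y) * y);
}

/* Hypothetical helper: choose a FAT block size for a sector size. */
uint32_t
fat_block_size(int is_fat12, uint32_t sector_size, uint32_t page_size)
{
	/* FAT12 defaults to 3*512 bytes, everything else to one page. */
	uint32_t base = is_fat12 ? 3 * 512 : page_size;

	/* I/O must be done in whole sectors, so round up to a multiple. */
	return (round_up(base, sector_size));
}
```

For example, a FAT12 volume with 2048-byte sectors gets 1536 rounded up to 2048, and a non-FAT12 volume with 32768-byte sectors and 4096-byte pages gets 32768.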
|
171406 |
12-Jul-2007 |
bde |
Fix some bugs involving the fsinfo block (many remain unfixed). This is part of fixing msdosfs for large sector sizes. One of the fixed bugs was fatal for large sector sizes.
1. The fsinfo block has size 512, but it was misunderstood and declared as having size 1024, with nothing in the second 512 bytes except a signature at the end. The second 512 bytes actually normally (if the file system was created by Windows) consist of a second boot sector which is normally (in WinXP) empty except for a signature -- the normal layout is one boot sector, one fsinfo sector, another boot sector, then these 3 sectors duplicated. However, other layouts are valid. newfs_msdos produces a valid layout with one boot sector, one fsinfo sector, then these 2 sectors duplicated. The signature check for the extra part of the fsinfo was thus normally checking the signature in either the second boot sector or the first boot sector in the copy, and thus accidentally succeeding. The extra signature check would just fail for weirder layouts with 512-byte sectors, and for normal layouts with any other sector size.
Remove the extra bytes and the extra signature check.
2. Old versions did i/o to the fsinfo block using size 1024, with the second half only used for the extra signature check on read. This was harmless for sector size 512, and worked accidentally for sector size 1024. The i/o just failed for larger sector sizes.
The version being fixed did i/o to the fsinfo block using size fsi_size(pmp) = (1024 << ((pmp)->pm_BlkPerSec >> 2)). This expression makes no sense. It happens to work for small sector sizes, but for sector size 32K it gives the preposterous value of 64M and thus causes panics. A sector size of 32768 is necessary for at least some DVD-RW's (where the minimum write size is 32768 although the minimum read size is 2048).
Now that the size of the fsinfo block is 512, it always fits in one sector so there is no need for a macro to express it. Just use the sector size where the old code uses 1024.
Approved by: re (kensmith) Approved by: nyan (several years ago for a different version of (2))
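The arithmetic of the old fsi_size() expression can be checked with a small user-space sketch. It assumes that pm_BlkPerSec is the sector size divided by a 512-byte device block size, which is consistent with the 64M figure quoted above; old_fsi_size() is a hypothetical name.

```c
#include <stdint.h>

#define SKETCH_DEV_BSIZE 512u	/* assumed device block size */

/* The old macro's arithmetic, taking the sector size in bytes. */
uint64_t
old_fsi_size(uint32_t sector_size)
{
	uint32_t blk_per_sec = sector_size / SKETCH_DEV_BSIZE;

	return ((uint64_t)1024 << (blk_per_sec >> 2));
}
```

Under that assumption the result is 1024 for 512-byte and 1024-byte sectors, 2048 and 4096 for 2K and 4K sectors, and 1024 << 16 = 67108864 (64M) for 32K sectors.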
|
171379 |
11-Jul-2007 |
rwatson |
Fix ioctls on the control vnode: ioctls on a character device fail with ENOTTY. Make the control vnode a regular file so that ioctls are passed through to our kernel module.
Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)
|
171378 |
11-Jul-2007 |
rwatson |
Avoid a panic in insmntque when we pass a NULL mount: this reenables some previously disabled code which according to the comment caused a problem during shutdown. But even that is still better than triggering a kernel panic whenever venus is started.
Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)
|
171377 |
11-Jul-2007 |
rwatson |
Replace CODA_OPEN with CODA_OPEN_BY_FD: coda_open was disabled because we can't open container files by device/inode number pair anymore. Replace the CODA_OPEN upcall with CODA_OPEN_BY_FD, where venus returns an open file descriptor for the container file. We can then grab a reference on the vnode coda_psdev.c:vc_nb_write and use this vnode for further accesses to the container file.
Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)
|
171375 |
11-Jul-2007 |
rwatson |
Resolve Coda mount failing because Coda failed to match the device operations. But we don't have to, if we find the coda_mntinfo structure for this device in our linked list, we know the device is good.
Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)
|
171374 |
11-Jul-2007 |
rwatson |
Avoid crash when opening Coda device: when allocating coda_mntinfo, we need to initialize dev so that we can actually find the allocated coda_mntinfo structure later on.
Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)
|
171362 |
11-Jul-2007 |
delphij |
MFp4: Make use of the kernel unit number allocation facility for tmpfs nodes.
Submitted by: Mingyan Guo <guomingyan gmail com> Approved by: re (tmpfs blanket)
|
171343 |
10-Jul-2007 |
bde |
Don't use almost perfectly pessimal cluster allocation. Allocation of the first cluster in a file (and, if the allocation cannot be continued contiguously, for subsequent clusters in a file) was randomized in an attempt to leave space for contiguous allocation of subsequent clusters in each file when there are multiple writers. This reduced internal fragmentation by a few percent, but it increased external fragmentation by up to a few thousand percent.
Use simple sequential allocation instead. Actually maintain the fsinfo sequence index for this. The read and write of this index from/to disk still have many non-critical bugs, but we now write an index that has something to do with our allocations instead of being modified garbage. If there is no fsinfo on the disk, then we maintain the index internally and don't go near the bugs for writing it.
Allocating the first free cluster gives a layout that is almost as good (better in some cases), but takes too much CPU if the FAT is large and the first free cluster is not near the beginning.
The effect of this change for untar and tar of a slightly reduced copy of /usr/src on a new file system was:
Before (msdosfs 4K-clusters):
untar: 459.57 real    untar from cached file (actually a pipe)
tar:   342.50 real    tar from uncached tree to /dev/zero
Before (ffs2 soft updates 4K-blocks 4K-frags):
untar: 39.18 real
tar:   29.94 real
Before (ffs2 soft updates 16K-blocks 2K-frags):
untar: 31.35 real
tar:   18.30 real
After (msdosfs 4K-clusters):
untar: 54.83 real
tar:   16.18 real
All of these times can be improved further.
With multiple concurrent writers or readers (especially readers), the improvement is smaller, but I couldn't find any case where it is negative. 342 seconds for tarring up about 342 MB on a ~47MB/S partition is just hard to unimprove on. (This operation would take about 7.3 seconds with reasonably localized allocation and perfect read-ahead.) However, for active file systems, 342 seconds is closer to normal than the 16+ seconds above or the 11 seconds with other changes (best I've measured -- won easily by msdosfs!). E.g., my active /usr/src on ffs1 is quite old and fragmented, so reading to prepare for the above benchmark takes about 6 times longer than reading back the fresh copies of it.
Approved by: re (kensmith)
|
171308 |
08-Jul-2007 |
delphij |
MFp4: - Plug memory leak. - Respect underlying vnode's properties rather than assuming that the user wants root:wheel + 0755. Useful for using tmpfs(5) for /tmp. - Use roundup2 and howmany macros instead of rolling our own version. - Try to fix fsx -W -R foo case. - Instead of blindly zeroing a page, determine whether we need a pagein in order to prevent data corruption. - Fix several bugs reported by Coverity.
Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij Coverity ID: CID 2550, 2551, 2552, 2557 Approved by: re (tmpfs blanket)
|
171181 |
03-Jul-2007 |
kib |
Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls destroy_dev() from d_close() cdev method would self-deadlock. devfs_close() bumps the device thread reference counter, and destroy_dev() sleeps, waiting for si_threadcount to reach zero for cdev without d_purge method.
destroy_dev_sched() could be used instead from d_close(), to schedule execution of destroy_dev() in another context. The destroy_dev_sched_drain() function can be used to drain the scheduled calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains the events clone to make sure no lingering devices are left after dev_clone event handler deregistered.
make_dev_credf(MAKEDEV_REF) function should be used from dev_clone event handlers instead of make_dev()/make_dev_cred() to ensure that created device has reference counter bumped before cdev mutex is dropped inside make_dev().
Reviewed by: tegge (early versions), njl (programming interface) Debugging help and testing by: Peter Holm Approved by: re (kensmith)
|
171087 |
29-Jun-2007 |
delphij |
MFp4:
- Remove unnecessary NULL checks after M_WAITOK allocations. - Use VOP_ACCESS instead of hand-rolled suser_cred() calls. [1] - Use malloc(9) KPI to allocate memory for string. The optimization taken from NetBSD is not valid for FreeBSD because our malloc(9) already acts that way. [2]
Requested by: rwatson [1] Submitted by: Howard Su [2] Approved by: re (tmpfs blanket)
|
171070 |
28-Jun-2007 |
delphij |
Space/style cleanups after last set of commits.
Approved by: re (tmpfs blanket)
|
171069 |
28-Jun-2007 |
delphij |
Staticify most of fifo/vn operations, they should not be directly exposed outside.
Approved by: re (tmpfs blanket)
|
171068 |
28-Jun-2007 |
delphij |
Use vfs_timestamp instead of nanotime when obtaining a timestamp for use with timekeeping.
Approved by: re (tmpfs blanket)
|
171067 |
28-Jun-2007 |
delphij |
Reorder tf_gen and tf_id in struct tmpfs_fid. This saves 8 bytes on amd64 architecture.
Obtained from: NetBSD Approved by: re (tmpfs blanket)
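How a reorder of this kind saves space can be sketched with two hypothetical layouts. The field names and widths below are assumptions chosen to show the effect on an ABI where 64-bit integers are 8-byte aligned (such as amd64); they are not the actual struct tmpfs_fid definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: 24 bytes with 8-byte alignment. */
struct fid_before {
	uint16_t len;
	uint16_t pad;
	uint64_t gen;	/* 8-byte aligned: 4 bytes of padding precede it */
	uint32_t id;	/* followed by 4 bytes of tail padding */
};

/* Hypothetical "after" layout: 16 bytes, no padding needed. */
struct fid_after {
	uint16_t len;
	uint16_t pad;
	uint32_t id;	/* fills the gap that used to be padding */
	uint64_t gen;
};
```

On such an ABI sizeof(struct fid_before) is 24 and sizeof(struct fid_after) is 16, a saving of 8 bytes with no change in the fields stored.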
|
171040 |
26-Jun-2007 |
delphij |
Remove two function prototypes that are no longer used.
Approved by: re (tmpfs blanket)
|
171038 |
26-Jun-2007 |
delphij |
- Sync with NetBSD's RCSID (HEAD preferred). - Correct a typo.
Approved by: re (tmpfs blanket)
|
171029 |
25-Jun-2007 |
delphij |
MFp4: Several clean-ups and improvements over tmpfs:
- Remove tmpfs_zone_xxx KPI, the uma(9) wrapper, since it does not bring any value now. - Use |= instead of = when applying VV_ROOT flag. - Remove tm_avariable_nodes list. Use uma to hold the released nodes. - init/destroy interlock mutex of node when init/fini instead of ctor/dtor. - Change memory computing using u_int to fix negative value in 2G mem machine. - Remove unnecessary bzero's - Rely on uma logic to make file id allocation harder to guess. - Fix some unsigned/signed related things. Make sure we respect -o size=xxxx - Use wire instead of hold a page. - Pass allocate_zero to obtain zeroed pages upon first use.
Submitted by: Howard Su Approved by: re (tmpfs blanket, kensmith)
|
171023 |
25-Jun-2007 |
rafan |
- Remove UMAP filesystem. It was disconnected from build three years ago, and it is seriously broken.
Discussed on: freebsd-arch@ Approved by: re (mux)
|
170922 |
18-Jun-2007 |
delphij |
Use vfs_timestamp() instead of nanotime() - leave it up to the user to decide how detailed they want timestamps to be.
|
170903 |
18-Jun-2007 |
delphij |
MFp4: fix two locking problems:
- Hold TMPFS_LOCK while updating tm_pages_used. - Hold vm page while doing uiomove.
This will hopefully fix all known panics.
Submitted by: Howard Su
|
170808 |
16-Jun-2007 |
delphij |
MFp4: Add tmpfs, an efficient memory file system.
Please note that this is currently considered an experimental feature, so there could be some rough edges. Consult http://wiki.freebsd.org/TMPFS for more information.
For now, connect tmpfs to build on i386 and amd64 architectures only. Please let us know if you have success with other platforms.
This work was developed by Julio M. Merino Vidal for NetBSD as a SoC project; Rohit Jalan ported it from NetBSD to FreeBSD. Howard Su and Glen Leeder have worked on it to continue this effort.
Obtained from: NetBSD via p4 Submitted by: Howard Su (with some minor changes) Approved by: re (kensmith)
|
170587 |
12-Jun-2007 |
rwatson |
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present.
Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c.
We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.
Reviewed by: csjp Obtained from: TrustedBSD Project
|
170577 |
11-Jun-2007 |
remko |
Correct corrupt read when the read starts at a non-aligned offset.
PR: kern/77234 MFC After: 1 week Approved by: imp (mentor) Requested by: many many people Submitted by: Andriy Gapon <avg at icyb dot net dot ua>
|
170472 |
09-Jun-2007 |
attilio |
rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically by changing their locking requirements (both assume the per-proc spinlock is held) and introducing rufetchcalc, which wraps both calls so that they are performed atomically.
Reviewed by: jeff Approved by: jeff (mentor)
|
170401 |
07-Jun-2007 |
bmah |
Fix off-by-one error (introduced in r1.60) that had the effect of disallowing a read of exactly MAXPHYS bytes.
Reviewed by: des, rdivacky MFC after: 1 week Sponsored by: nCircle Network Security
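The class of error can be illustrated generically. This sketch does not reproduce the code changed by the commit: the limit value and both function names are assumptions, and it only shows how an inclusive comparison rejects a request of exactly the maximum size.

```c
#include <stddef.h>

#define SKETCH_MAXPHYS (128 * 1024)	/* assumed limit, for illustration */

/* Off-by-one: ">=" also rejects a request of exactly the maximum. */
int
size_ok_buggy(size_t resid)
{
	return (!(resid >= SKETCH_MAXPHYS));
}

/* Intended check: only requests larger than the maximum are rejected. */
int
size_ok_fixed(size_t resid)
{
	return (!(resid > SKETCH_MAXPHYS));
}
```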
|
170307 |
05-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
170292 |
04-Jun-2007 |
attilio |
Do proper "locking" for missing vmmeters part. Now, we assume no more sched_lock protection for some of them and use the distributed loads method for vmmeter (distributed through CPUs).
Reviewed by: alc, bde Approved by: jeff (mentor)
|
170188 |
01-Jun-2007 |
trhodes |
Revert previous, part of NFS that I didn't know about.
|
170184 |
01-Jun-2007 |
trhodes |
Garbage collect msdosfs_fhtovp; it appears unused and I have been using MSDOSFS without this function and without problems for the last month.
|
170183 |
01-Jun-2007 |
kib |
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file: part 2. Convert calls missed in the first big commit.
Noted by: rwatson Pointy hat to: kib
|
170170 |
31-May-2007 |
attilio |
Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
170152 |
31-May-2007 |
kib |
Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file.
Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)
|
170093 |
29-May-2007 |
rwatson |
Where I previously removed calls to kdb_enter(), now remove include of kdb.h.
Pointed out by: bde
|
170015 |
27-May-2007 |
rwatson |
Rather than entering the debugger via kdb_enter() when detecting memory corruption under SMBUFS_NAME_DEBUG, panic() with the same error message.
|
170014 |
27-May-2007 |
rwatson |
Rather than entering the debugger via kdb_enter() in the event the root vnode is unexpectedly locked under NULLFS_DEBUG in nullfs and then returning EDEADLK, panic.
|
169671 |
18-May-2007 |
kib |
Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no longer generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.
|
169667 |
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
169168 |
01-May-2007 |
des |
The process lock is held when procfs_ioctl() is called. Assert that this is so, and PHOLD the process while sleeping since msleep() will release the lock.
|
168985 |
23-Apr-2007 |
des |
Fix old locking bugs which were revealed when pseudofs was made MPSAFE.
Submitted by: tegge
|
168977 |
23-Apr-2007 |
rwatson |
Rename mac*devfsdirent*() to mac*devfs*() to synchronize with SEDarwin, where similar data structures exist to support devfs and the MAC Framework, but are named differently.
Obtained from: TrustedBSD Project Sponsored by: SPARTA, Inc.
|
168968 |
23-Apr-2007 |
alc |
Add synchronization. Eliminate the acquisition and release of Giant.
Reviewed by: tegge
|
168884 |
20-Apr-2007 |
trhodes |
In some cases, like whenever devfs file times are zero, the fix(aa) will not be applied to dev entries. This leaves us with file times like "Jan 1 1970." Work around this problem by replacing the tv_sec == 0 check with a <= 3600 check. It's doubtful anyone will be booting within an hour of the Epoch, let alone care about a few seconds worth of nonzero timestamps. It's a hackish work around, but it does work and I have not experienced any negatives in my testing.
Discussed with: bde "Ok with me": phk
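The threshold change described above can be sketched in userspace C. This is only an illustration: the names sketch_fix_time and SKETCH_EPOCH_SLOP are hypothetical, and the assumption that an unset time is replaced by the boot time is not stated in the commit message itself.

```c
#include <assert.h>
#include <time.h>

/* Threshold from the commit message: one hour past the Epoch. */
#define SKETCH_EPOCH_SLOP 3600

/*
 * Return the timestamp to report. A stored time within the first hour
 * after the Epoch is treated as "never set" and replaced by the boot
 * time (the replacement value is an assumption of this sketch).
 */
static struct timespec
sketch_fix_time(struct timespec stored, struct timespec boottime)
{
	/* The workaround only makes sense if boot happened after the window. */
	assert(boottime.tv_sec > SKETCH_EPOCH_SLOP);
	if (stored.tv_sec <= SKETCH_EPOCH_SLOP)
		return (boottime);
	return (stored);
}
```

With an exact tv_sec == 0 test, a timestamp of a few seconds past the Epoch would slip through unchanged; the <= 3600 test catches it.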
|
168768 |
15-Apr-2007 |
des |
Avoid "unused variable" warning when building without PSEUDOFS_TRACE.
|
168764 |
15-Apr-2007 |
des |
Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE.
|
168763 |
15-Apr-2007 |
des |
Instead of stating GIANT_REQUIRED, just acquire and release Giant where needed. This does not make a difference now, but will when procfs is marked MPSAFE.
|
168759 |
15-Apr-2007 |
des |
Fix the same bug as in procfs_doproc{,db}regs(): check that uio_offset is 0 upon entry, and don't reset it before returning.
MFC after: 3 weeks
|
168758 |
15-Apr-2007 |
des |
Don't reset uio_offset to 0 before returning. Instead, refuse to service requests where uio_offset is not 0 to begin with. This fixes a long-standing bug where e.g. 'cat /proc/$$/regs' would loop forever.
MFC after: 3 weeks
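A minimal userspace model of the behaviour described above, assuming a simplified stand-in for struct uio. The type sketch_uio, the function sketch_doregs, and the choice to return 0 with no data (rather than an errno) for a nonzero offset are all assumptions of this sketch, not the procfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct uio. */
struct sketch_uio {
	char	*buf;		/* destination buffer */
	size_t	 resid;		/* bytes of space left in buf */
	long	 offset;	/* current file offset */
};

/*
 * Copy a fixed-size snapshot to the caller. A request that does not
 * start at offset 0 transfers nothing, so a sequential reader sees EOF
 * on its second read instead of the same data again.
 */
static int
sketch_doregs(struct sketch_uio *uio, const void *regs, size_t len)
{
	size_t n;

	assert(uio != NULL && regs != NULL);
	if (uio->offset != 0)
		return (0);		/* nothing transferred */
	n = len < uio->resid ? len : uio->resid;
	memcpy(uio->buf, regs, n);
	uio->buf += n;
	uio->resid -= n;
	uio->offset += (long)n;		/* advance; never reset to 0 */
	return (0);
}
```

If the offset were reset to 0 before returning, every read would look like the first one and a reader such as cat(1) would never reach end of file.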
|
168720 |
14-Apr-2007 |
des |
Further pseudofs improvements:
The pfs_info mutex is only needed to lock pi_unrhdr. Everything else in struct pfs_info is modified only while Giant is held (during vfs_init() / vfs_uninit()); add assertions to that effect.
Simplify pfs_destroy somewhat.
Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the assertions which were added in the previous commit to ensure they were consistent.
Assert that Giant is held while the vnode cache is initialized and destroyed. Also assert that the cache is empty when it is destroyed.
Rename the vnode cache mutex for consistency.
Fix a long-standing bug in pfs_getattr(): it would uncritically return the node's pn_fileno as st_ino. This would result in st_ino being 0 if the node had not previously been visited by readdir(), and also in an incorrect st_ino for process directories and any files contained therein. Correct this by abstracting the fileno manipulations previously done in pfs_readdir() into a new function, pfs_fileno(), which is used by both pfs_getattr() and pfs_readdir().
|
168637 |
11-Apr-2007 |
des |
Add a flag to struct pfs_vdata to mark the vnode as dead (e.g. process-specific nodes when the process exits).
Move the vnode-cache-walking loop which was duplicated in pfs_exit() and pfs_disable() into its own function, pfs_purge(), which looks for vnodes marked as dead and / or belonging to the specified pfs_node and reclaims them. Note that this loop is still extremely inefficient.
Add a comment in pfs_vncache_alloc() explaining why we have to purge the vnode from the vnode cache before returning, in case anyone should be tempted to remove the call to cache_purge().
Move the special handling for pfstype_root nodes into pfs_fileno_alloc() and pfs_fileno_free() (the root node's fileno must always be 2). This also fixes a bug where pfs_fileno_free() would reclaim the root node's fileno, triggering a panic in the unr code, as that fileno was never allocated from unr to begin with.
When destroying a pfs_node, release its fileno and purge it from the vnode cache. I wish we could put off the call to pfs_purge() until after the entire tree had been destroyed, but then we'd have vnodes referencing freed pfs nodes. This probably doesn't matter while we're still under Giant, but might become an issue later.
When destroying a pseudofs instance, destroy the tree before tearing down the fileno allocator.
In pfs_mount(), acquire the mountpoint interlock when required.
MFC after: 3 weeks
|
168387 |
05-Apr-2007 |
des |
Whitespace nits.
|
168355 |
04-Apr-2007 |
rwatson |
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.
- Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
- Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
- Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
- Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.
In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio).
Tested by: kris Discussed with: jhb, kris, attilio, jeff
|
167916 |
26-Mar-2007 |
kris |
Annotate that this Giant acquisition is dependent on tty locking.
|
167875 |
24-Mar-2007 |
maxim |
o cd9660 code repo-copied, update a comment.
|
167497 |
13-Mar-2007 |
tegge |
Make insmntque() externally visible and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled.
Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure.
Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnodes belonging to a file system, which is unsafe for a file system marked MPSAFE.
Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility.
Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup.
Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib
|
167482 |
12-Mar-2007 |
des |
Add a pn_destroy field to pfs_node. This field points to a destructor function which is called from pfs_destroy() before the node is reclaimed.
Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor function in addition to the usual attr / fill / vis pointers.
This breaks both the programming and binary interfaces between pseudofs and its consumers. It is believed that there are no pseudofs consumers outside the source tree, so that the impact of this change is minimal.
Submitted by: Aniruddha Bohra <bohra@cs.rutgers.edu>
|
167158 |
02-Mar-2007 |
mpp |
Change fifo_printinfo to check if the vnode v_fifoinfo pointer is NULL and print a message to that effect to prevent a panic.
|
167086 |
27-Feb-2007 |
jhb |
Use pause() rather than tsleep() on stack variables and function pointers.
|
166858 |
21-Feb-2007 |
cognet |
Check that the error returned by vfs_getopts() is not ENOENT before assuming there's actually an error. This is just in order to unbreak ntfs on current, before a proper solution is committed.
|
166826 |
19-Feb-2007 |
rwatson |
Do allow PIOCSFL in jail for setguid processes; this is more consistent with other debugging checks elsewhere. XXX comment on the fact that p_candebug() is not being used here remains.
|
166774 |
15-Feb-2007 |
pjd |
Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic.
Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.
VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE.
Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs
|
166639 |
11-Feb-2007 |
rodrigc |
Forced commit and #include changes for repo copy from sys/isofs/cd9660 to sys/fs/cd9660.
Discussed on freebsd-current.
|
166559 |
08-Feb-2007 |
rodrigc |
Add noatime to the list of mount options that msdosfs accepts.
PR: 108896 Submitted by: Eugene Grosbein <eugen grosbein pp ru>
|
166558 |
08-Feb-2007 |
rodrigc |
Style fixes: use ANSI C function declarations.
|
166548 |
07-Feb-2007 |
kib |
Fix the race of dereferencing /proc/<pid>/file with execve(2) by caching the value of p_textvp. This way, we always unlock the locked vnode. While there, vhold() the vnode around the vn_lock().
Reported and tested by: Guy Helmer (ghelmer palisadesys com) Approved by: des (procfs maintainer) MFC after: 1 week
|
166524 |
06-Feb-2007 |
rodrigc |
Eliminate some dead code which was introduced in 1.23, yet was always commented out.
|
166429 |
02-Feb-2007 |
pjd |
coda_vptofh is never defined nor used.
|
166343 |
30-Jan-2007 |
avatar |
Fixing compilation bustage by removing references to opt_msdosfs.h.
This auto-generated header file no longer exists since the removal of MSDOSFS_LARGE in sys/conf/options:1.574.
|
166341 |
30-Jan-2007 |
trhodes |
Fix spacing from my previous commit to this file:
Noticed by: fjoe
|
166340 |
30-Jan-2007 |
rodrigc |
Add a "-o large" mount option for msdosfs. Convert compile-time checks for #ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified.
Test case provided by Oliver Fromme:
    truncate -s 200G test.img
    mdconfig -a -t vnode -f test.img -u 9
    newfs_msdos -s 419430400 -n 1 /dev/md9 zip250
    mount -t msdosfs /dev/md9 /mnt            # should fail
    mount -t msdosfs -o large /dev/md9 /mnt   # should succeed
PR: 105964 Requested by: Oliver Fromme <olli lurza secnetix de> Tested by: trhodes MFC after: 2 weeks
|
166167 |
22-Jan-2007 |
kib |
Below is slightly edited description of the LOR by Tor Egge:
--------------------------
[Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode (that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode.
A simplified scenario:
    root fs             var fs
    /         A         / (/var)          D
    /var      B         /log (/var/log)   E
    vfs lock  C         vfs lock          F
Within each file system, the lock order is clear: C->A->B and F->D->E
When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order:
    L1: C->A->B
              |
              +->F->D->E

    L2: F->D->E
              |
              +->C->A->B
The lookup() process for namei("/var") mixes those two lock orders:
    VOP_LOOKUP() obtains B while A is held
    vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2)
    vput() releases lock on B
    VOP_UNLOCK() releases lock on A
    VFS_ROOT() obtains lock on D while shared lock on F is held
    vfs_unbusy() releases shared lock on F
    vn_lock() obtains lock on A while D is held (violates L1, follows L2)
dounmount() follows L1 (B is locked while F is drained).
Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed).
With unmount, you can get 4 processes in a deadlock:
    p1: holds D, want A (in lookup())
    p2: holds shared lock on F, want D (in VFS_ROOT())
    p3: holds B, want drain lock on F (in dounmount())
    p4: holds A, want B (in VOP_LOOKUP())
You can have more than one instance of p2.
The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode.
- Tor Egge
To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. Thus, a placeholder deadfs vnode, vp_crossmp, is introduced and filled into ni_dvp.
Idea by: ups Reviewed by: tegge, ups, jeff, rwatson (mac interaction) Tested by: Peter Holm MFC after: 2 weeks
|
166062 |
16-Jan-2007 |
trhodes |
Add a 3rd entry in the cache, which keeps the end position from just before extending a file. This has the desired effect of keeping the write speed constant. And yes, that helps a lot: large files are now always copied at full speed, and I have seen improvements using benchmarks/bonnie.
Stolen from: NetBSD Reviewed by: bde
|
166030 |
15-Jan-2007 |
pav |
Rewrite the udf_read() routine to use a file vnode instead of the devvp vnode. The code is modelled after cd9660, including support for simple read-ahead courtesy of clustered read.
Fix udf_strategy to DTRT.
This change fixes sendfile(2) not to send out garbage.
Reviewed by: scottl MFC after: 1 month
|
165879 |
07-Jan-2007 |
pav |
Tell the backing v_object the filesize right on its creation.
MFC after: 1 week
|
165836 |
06-Jan-2007 |
rodrigc |
When performing a mount update to change a mount from read-only to read-write, do not call markvoldirty() until the mount has been flagged as read-write. Due to the nature of the msdosfs code, this bug only seemed to appear for FAT-16 and FAT-32.
This fixes the testcase:
    #!/bin/sh
    dd if=/dev/zero bs=1m count=1 oseek=119 of=image.msdos
    mdconfig -a -t vnode -f image.msdos
    newfs_msdos -F 16 /dev/md0 fd120m
    mount_msdosfs -o ro /dev/md0 /mnt
    mount | grep md0
    mount -u -o rw /dev/md0; echo $?
    mount | grep md0
    umount /mnt
    mdconfig -d -u 0
PR: 105412 Tested by: Eugene Grosbein <eugen grosbein pp ru>
|
165804 |
05-Jan-2007 |
rodrigc |
Simplify code in union_hashins() and union_hashget() functions. These functions now more closely resemble similar functions in nullfs. This also eliminates some errors.
Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>
|
165792 |
05-Jan-2007 |
rodrigc |
Eliminate obsolete comment, now that getushort() is implemented in terms of functions in <sys/endian.h>.
|
165785 |
05-Jan-2007 |
rodrigc |
Eliminate ASSERT_VOP_ELOCKED panics when doing mkdir or symlink when sysctl vfs.lookup_shared=1.
Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>
|
165737 |
02-Jan-2007 |
jhb |
Use the vnode interlock to close a race where pfs_vncache_alloc() could attempt to vn_lock() a destroyed vnode resulting in a hang.
MFC after: 1 week Submitted by: ups Reviewed by: des
|
165500 |
23-Dec-2006 |
pav |
Call vnode_create_vobject() in VOP_OPEN. Makes mmap work on UDF filesystem.
PR: kern/92040 Approved by: scottl MFC after: 1 week
|
165431 |
21-Dec-2006 |
marcel |
Unbreak 64-bit little-endian systems that do require alignment. The fix involves using le16dec(), le32dec(), le16enc() and le32enc(). This eliminates invalid casts and duplicated logic.
|
165342 |
19-Dec-2006 |
rodrigc |
For big-endian version of getulong() macro, cast result to u_int32_t. This macro was written expecting a 32-bit unsigned long, and doesn't work properly on 64-bit systems. This bug caused vn_stat() to return incorrect values for files larger than 2gb on msdosfs filesystems on 64-bit systems.
PR: 106703 Submitted by: Axel Gonzalez <loox e-shell net> MFC after: 3 days
|
165341 |
19-Dec-2006 |
rodrigc |
Fix get_ulong() macro on AMD64 (or any little-endian 64-bit platform). This bug caused vn_stat() to fail on files larger than 2gb on msdosfs filesystems on AMD64.
PR: 106703 Tested by: Axel Gonzalez <loox e-shell net> MFC after: 3 days
|
165037 |
09-Dec-2006 |
rodrigc |
Remove unused variable in unionfs_root().
Submitted by: daichi, Masanori OZAWA
|
165036 |
09-Dec-2006 |
rodrigc |
Use vfs_mount_error() in a few places to give more descriptive mount error messages.
|
165035 |
09-Dec-2006 |
rodrigc |
Add locking around calls to unionfs_get_node_status() in unionfs_ioctl() and unionfs_poll().
Submitted by: daichi, Masanori OZAWA <ozawa@ongs.co.jp> Prompted by: kris
|
165034 |
09-Dec-2006 |
rodrigc |
In unionfs_readdir(), prevent a possible NULL dereference.
CID: 1667 Found by: Coverity Prevent (tm)
|
165033 |
09-Dec-2006 |
rodrigc |
In unionfs_hashrem(), use LIST_FOREACH_SAFE when iterating over the list of nodes to free them.
CID: 1668 Found by: Coverity Prevent (tm)
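Why a "safe" iterator is needed when freeing nodes can be shown with a hand-rolled list. This sketch deliberately avoids <sys/queue.h>; struct sketch_node and both functions are hypothetical and only mirror what a _SAFE iteration does, namely saving the next pointer before the current node is released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; the real unionfs node and list macros differ. */
struct sketch_node {
	struct sketch_node *next;
	int value;
};

/* Push a new node on the front of the list and return the new head. */
static struct sketch_node *
sketch_push(struct sketch_node *head, int value)
{
	struct sketch_node *n = malloc(sizeof(*n));

	assert(n != NULL);	/* sketch only: real code must handle failure */
	n->next = head;
	n->value = value;
	return (n);
}

/* Free every node and return how many were freed. */
static int
sketch_free_all(struct sketch_node *head)
{
	struct sketch_node *n, *tmp;
	int freed = 0;

	for (n = head; n != NULL; n = tmp) {
		tmp = n->next;	/* read the link while n is still valid */
		free(n);
		freed++;
	}
	return (freed);
}
```

A plain loop that advances with n = n->next after free(n) would read freed memory, which is the class of defect the commit above addresses.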
|
165022 |
09-Dec-2006 |
rodrigc |
Minor cleanup. If we are doing a mount update, and we pass in an "export" flag indicating that we are trying to NFS export the filesystem, and the MSDOSFS_LARGEFS flag is set on the filesystem, then deny the mount update and export request. Otherwise, let the full mount update proceed normally. MSDOSFS_LARGEFS and NFS don't mix because of the way inodes are calculated for MSDOSFS_LARGEFS.
MFC after: 3 days
|
165005 |
08-Dec-2006 |
kientzle |
The ISO9660 spec does allow files up to 4G. Change the i_size field to "unsigned long" so that it actually works. Thanks to Robert Sciuk for sending me a DVD that demonstrated ISO9660-formatted media with a file >2G. I've now fixed this both in libarchive and in the cd9660 filesystem.
MFC after: 14 days
|
164936 |
06-Dec-2006 |
julian |
Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent. Specifically, remove:
- Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated.
- All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
The ULE scheduler compiles again but I have no idea if it works.
The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
Tested by David Xu, and Dan Eischen using libthr and libpthread.
|
164855 |
03-Dec-2006 |
maxim |
o Do not leave uninitialized birthtime: in MSDOSFSMNT_LONGNAME set birthtime to FAT CTime (creation time) and in the other cases set birthtime to -1.
o Set ctime to mtime instead of FAT CTime which has completely different meaning.
PR: kern/106018 Submitted by: Oliver Fromme MFC after: 1 month
|
164836 |
02-Dec-2006 |
rodrigc |
Add missing includes for <sys/buf.h> and <sys/bio.h>.
|
164829 |
02-Dec-2006 |
rodrigc |
Many, many thanks to Masanori OZAWA <ozawa@ongs.co.jp> and Daichi GOTO <daichi@FreeBSD.org> for submitting this major rewrite of unionfs. This rewrite was done to try to solve many of the longstanding crashing and locking issues in the existing unionfs implementation. This implementation also adds a 'MASQUERADE mode', which allows the user to set different user, group, and file permission modes in the upper layer.
Submitted by: daichi, Masanori OZAWA Reviewed by: rodrigc (modified for minor style issues)
|
164627 |
26-Nov-2006 |
maxim |
o From the submitter: dos2unixchr will convert to lower case if LCASE_BASE or LCASE_EXT or both are set. But dos2unixfn uses dos2unixchr separately for the basename and the extension. So if either LCASE_BASE or LCASE_EXT is set, dos2unixfn will convert both the basename and extension to lowercase because it is blindly passing in the state of both flags to dos2unixchr. The bit masks I used ensure that only the state of LCASE_BASE gets passed to dos2unixchr when the basename is converted, and only the state of LCASE_EXT is passed in when the extension is converted.
PR: kern/86655 Submitted by: Micah Lieske MFC after: 3 weeks
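The masking described by the submitter can be sketched as below. The flag values and every sketch_ name are assumptions made for illustration; the functions only imitate the division of labour between dos2unixfn and dos2unixchr and are not the msdosfs code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Flag values are assumptions of this sketch. */
#define SKETCH_LCASE_BASE	0x08
#define SKETCH_LCASE_EXT	0x10

/* Lower-case c if any bit in 'lower' is set, otherwise return it as is. */
static char
sketch_chr(char c, int lower)
{
	return (lower ? (char)tolower((unsigned char)c) : c);
}

/*
 * Build "base.ext" in out and return its length. Each part is converted
 * with only its own flag bit masked out of lcase, so the base flag cannot
 * influence the extension and vice versa.
 */
static size_t
sketch_dos2unixfn(const char *base, const char *ext, int lcase,
    char *out, size_t outlen)
{
	size_t i = 0;

	assert(base != NULL && ext != NULL && out != NULL && outlen > 0);
	for (; *base != '\0' && i + 1 < outlen; base++)
		out[i++] = sketch_chr(*base, lcase & SKETCH_LCASE_BASE);
	if (*ext != '\0' && i + 1 < outlen)
		out[i++] = '.';
	for (; *ext != '\0' && i + 1 < outlen; ext++)
		out[i++] = sketch_chr(*ext, lcase & SKETCH_LCASE_EXT);
	out[i] = '\0';
	return (i);
}
```

Passing the unmasked lcase value to both loops would reproduce the reported bug: either flag alone would lower-case the whole name.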
|
164450 |
20-Nov-2006 |
le |
Fix an integer overflow and allow access to files larger than 4GB on NTFS.
|
164356 |
17-Nov-2006 |
kib |
Wake up the PIOCWAIT handler on process exit in addition to the stop events. &p->p_stype is explicitly woken up on process exit for us.
Now, truss /nonexistent exits with error instead of waiting until killed by signal.
Reported by: Nikos Vassiliadis nvass at teledomenet gr Reviewed by: jhb MFC after: 1 week
|
164248 |
13-Nov-2006 |
kmacy |
Change vop_lock handling to allow tracking of callers' file and line for acquisition of lockmgr locks.
Approved by: scottl (standing in for mentor rwatson)
|
164033 |
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
163993 |
05-Nov-2006 |
bp |
Create a bidirectional mapping of the DOS 'read only' attribute to the 'w' flag.
PR: kern/77958 Submitted by: ghozzy gmail com MFC after: 1 month
|
163709 |
26-Oct-2006 |
jb |
Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE).
Reviewed by: davidxu@
|
163652 |
24-Oct-2006 |
phk |
Ditch crummy fattime <--> timespec conversion functions
|
163651 |
24-Oct-2006 |
phk |
Drop crummy fattime to timespec conversion routines.
Leave a XXX here for anybody able to test.
|
163647 |
24-Oct-2006 |
phk |
Replace slightly crummy fattime<->timespec conversion functions.
|
163606 |
22-Oct-2006 |
rwatson |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project Sponsored by: SPARTA
|
163559 |
21-Oct-2006 |
trhodes |
Fake the link count until we have no choice but to load data from the MFT.
PR: 86965 Submitted by: Lowell Gilbert <lgfbsd@be-well.ilk.org>
|
163530 |
20-Oct-2006 |
kib |
Update the access and modification times for dev while still holding thread reference on it.
Reviewed by: tegge Approved by: pjd (mentor)
|
163529 |
20-Oct-2006 |
kib |
Fix the race between devfs_fp_check and devfs_reclaim. Dereference the vnode's v_rdev and increment the dev threadcount, as well as clear it (in devfs_reclaim), under the dev_lock().
Reviewed by: tegge Approved by: pjd (mentor)
|
163481 |
18-Oct-2006 |
kib |
Properly lock the vnode around vgone() calls.
Unlock the vnode in devfs_close() while calling into the driver d_close() routine.
devfs_revoke() changes by: ups Reviewed and bugfixes by: tegge Tested by: mbr, Peter Holm Approved by: pjd (mentor) MFC after: 1 week
|
162970 |
02-Oct-2006 |
phk |
Use utc_offset() where applicable, and hide the internals of it as static variables.
|
162954 |
02-Oct-2006 |
phk |
First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.
Use libkern's much more time- & space-efficient BCD routines.
|
162711 |
27-Sep-2006 |
ru |
Fix our ioctl(2) implementation when the argument is "int". New ioctls passing integer arguments should use the _IOWINT() macro. This fixes a lot of ioctl's not working on sparc64, most notable being keyboard/syscons ioctls.
Full ABI compatibility is provided, with the bonus of fixing the handling of old ioctls on sparc64.
Reviewed by: bde (with contributions) Tested by: emax, marius MFC after: 1 week
|
162647 |
26-Sep-2006 |
tegge |
Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().
|
162443 |
19-Sep-2006 |
kib |
Fix the bug in rev. 1.134. In devfs_allocv_drop_refs(), when not_found == 2 and drop_dm_lock is true, no unlocking shall be attempted. The lock is already dropped and memory is freed.
Found with: Coverity Prevent(tm) CID: 1536 Approved by: pjd (mentor)
|
162398 |
18-Sep-2006 |
kib |
Resolve the devfs deadlock caused by LOR between devfs_mount->dm_lock and vnode lock in devfs_allocv. Do this by temporarily dropping dm_lock around vnode locking.
For safe operation, add hold counters for both devfs_mount and devfs_dirent, and a DE_DOOMED flag for devfs_dirent. These facilities allow operation to continue after dm_lock is dropped, by making sure that referenced memory does not disappear.
Reviewed by: tegge Tested by: kris Approved by: kan (mentor) PR: kern/102335
|
162255 |
12-Sep-2006 |
imp |
Put the osta.c license on osta.h. The license is the same.
Approved by: scottl@
|
161425 |
17-Aug-2006 |
imp |
while (0); -> while (0) in multi-line macros
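The reason a multi-line macro should end in while (0) without a semicolon is easiest to see in an if/else. The macro SKETCH_CLAMP and the function sketch_normalize are hypothetical examples written for this note and are not taken from the changed files.

```c
#include <assert.h>

/*
 * Multi-line macro wrapped in do { ... } while (0) with NO trailing
 * semicolon; the caller's own ';' completes the statement.
 */
#define SKETCH_CLAMP(v, lo, hi) do {					\
	if ((v) < (lo))							\
		(v) = (lo);						\
	if ((v) > (hi))							\
		(v) = (hi);						\
} while (0)

/* Clamp non-negative inputs to [0, 100]; map negative inputs to -1. */
static int
sketch_normalize(int v)
{
	if (v >= 0)
		SKETCH_CLAMP(v, 0, 100);	/* expands to one statement */
	else
		v = -1;
	assert(v >= -1 && v <= 100);	/* postcondition of this sketch */
	return (v);
}
```

If the macro definition ended in "while (0);", the call site would expand to two statements (the loop and an empty one), and the else would no longer attach to the if, which is a compile error.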
|
161125 |
09-Aug-2006 |
alc |
Introduce a field to struct vm_page for storing flags that are synchronized by the lock on the object containing the page.
Transition PG_WANTED and PG_SWAPINPROG to use the new field, eliminating the need for holding the page queues lock when setting or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to VPO_WANTED and VPO_SWAPINPROG, respectively.
Eliminate the assertion that the page queues lock is held in vm_page_io_finish().
Eliminate the acquisition and release of the page queues lock around calls to vm_page_io_finish() in kern_sendfile() and vfs_unbusy_pages().
|
160964 |
04-Aug-2006 |
yar |
Commit the results of the typo hunt by Darren Pilgrim. This change affects documentation and comments only, no real code involved.
PR: misc/101245 Submitted by: Darren Pilgrim <darren pilgrim bitfreak org> Tested by: md5(1) MFC after: 1 week
|
160939 |
03-Aug-2006 |
delphij |
When the volume is being downgraded from a read-write mode, mark it as clean.
PR: kern/85366 Submitted by: Dan Lukes <dan at obluda dot cz> MFC After: 2 weeks
|
160664 |
25-Jul-2006 |
yar |
In udf_find_partmaps(), when we find a type 1 partition map, we have to skip the actual type 1 length (6 bytes). With this change, it is now possible to correctly spot the VAT partition map in certain discs.
Submitted by: Pedro Martelletto <pedro@ambientworks.net>
|
160489 |
18-Jul-2006 |
jhb |
Update comment.
|
160437 |
17-Jul-2006 |
jhb |
Lock the smb share before doing a 'put' on it in smbfs_unmount().
Tested by: "Jiawei Ye" <leafy7382 at gmail>
|
160425 |
17-Jul-2006 |
phk |
Remove the NDEVFSINO and NDEVFSOVERFLOW options which no longer exist in DEVFS.
Remove the opt_devfs.h file now that it is empty.
|
160310 |
12-Jul-2006 |
ups |
Add vnode interlocking to devfs. This prevents race conditions that can cause pagefaults or devfs to use arbitrary vnodes.
MFC after: 1 week
|
160190 |
08-Jul-2006 |
jhb |
Add a kern_close() so that the ABIs can close a file descriptor w/o having to populate a close_args struct and change some of the places that do.
|
160134 |
06-Jul-2006 |
rwatson |
Remove unneeded mac.h include.
MFC after: 3 days
|
160133 |
06-Jul-2006 |
rwatson |
Remove now unneeded opt_mac.h and mac.h includes.
MFC after: 3 days
|
160132 |
06-Jul-2006 |
rwatson |
Use #include "", not #include <> for opt_foo.h.
MFC after: 3 days
|
159996 |
27-Jun-2006 |
netchild |
Correctly calculate a buffer length. It was off by one so a read() returned one byte less than needed.
This is a RELENG_x_y candidate, since it fixes a problem with Oracle 10.
Noticed by: Dmitry Ganenko <dima@apk-inform.com> Testcase by: Dmitry Ganenko <dima@apk-inform.com> Reviewed by: des Submitted by: rdivacky Sponsored by: Google SoC 2006 MFC after: 1 week
|
159939 |
26-Jun-2006 |
scottl |
Fix a memory leak and a nested 'for' loop in the spare table handling.
Submitted by: Pedro Martelletto
|
159283 |
05-Jun-2006 |
ghelmer |
Upon further review, DES prefers this change over that in revision 1.13 to resolve the directory access problem for processes with P_SUGID flag set.
Suggested by: des
|
159128 |
01-Jun-2006 |
rodrigc |
mount_msdosfs.c: - remove call to getmntopts(), and just pass -o options to nmount(). This removes some confusion as to what options msdosfs can parse, by pushing the responsibility of option parsing to the VFS and FS specific code in the kernel.
msdosfs_vfsops.c: - add "force" and "sync" to msdosfs_opts. They used to be specified in mount_msdosfs.c, so move them here. It's not clear whether these options should be placed into global_opts in vfs_mount.c or not.
Motivated by: marcus
|
159117 |
31-May-2006 |
cperciva |
Enable inadvertently disabled "securenet" access controls in ypserv. [1]
Correct a bug in the handling of backslash characters in smbfs which can allow an attacker to escape from a chroot(2). [2]
Security: FreeBSD-SA-06:15.ypserv [1] Security: FreeBSD-SA-06:16.smbfs [2]
|
159023 |
28-May-2006 |
rodrigc |
Remove incorrect null_checkexp() routine. This will allow the NFS server to call vfs_stdcheckexp() on the exported nullfs filesystem, not the underlying filesystem being nullfs mounted. If the lower filesystem was not NFS exported, then the NFS exported null filesystem would not work.
Pointed out by: scottl PR: kern/87906 MFC after: 1 week
|
159019 |
28-May-2006 |
rodrigc |
Modify MNT_UPDATE behavior for nullfs so that it does not return EOPNOTSUPP if an "export" parameter was passed in. This should allow nullfs mounts to be NFS exported.
PR: kern/87906 MFC after: 1 week
|
158927 |
26-May-2006 |
rodrigc |
Remove calls to vfs_export() for exporting a filesystem for NFS mounting from individual filesystems. Call it instead in vfs_mount.c, after we call VFS_MOUNT() for a specific filesystem.
|
158924 |
26-May-2006 |
rodrigc |
Remove calls to vfs_export() for exporting a filesystem for NFS mounting from individual filesystems. Call it instead in vfs_mount.c, after we call VFS_MOUNT() for a specific filesystem.
|
158915 |
25-May-2006 |
ups |
Call vm_object_page_clean() with the object lock held.
Submitted by: kensmith@ Reviewed by: mohans@ MFC after: 6 days
|
158906 |
25-May-2006 |
ups |
Do not set B_NOCACHE on buffers when releasing them in flushbuflist(). If B_NOCACHE is set the pages of vm backed buffers will be invalidated. However clean buffers can be backed by dirty VM pages so invalidating them can lead to data loss. Add support for flushing dirty pages in the data invalidation functions of some network file systems.
This fixes data losses during vnode recycling (and other code paths using invalbuf(*,V_SAVE,*,*)) for data written using an mmaped file.
Collaborative effort by: jhb@,mohans@,peter@,ps@,ups@ Reviewed by: tegge@ MFC after: 7 days
|
158880 |
24-May-2006 |
ghelmer |
Revision 1.4 set access for all sensitive files in /proc/<PID> to mode 0 if a process's uid or gid has changed, but the /proc/<PID> directory itself was also set to mode 0. Assuming this doesn't open any security holes, open access to the /proc/<PID> directory for users other than root to read or search the directory.
Reviewed by: des (back in February) MFC after: 3 weeks
|
158651 |
16-May-2006 |
phk |
Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.
|
158611 |
15-May-2006 |
kbyanc |
Restore the ability to mount procfs and fdescfs filesystems via the mount(2) system call:
* Add cmount hook to fdescfs and pseudofs (and, by extension, procfs and linprocfs). This (mostly) restores the ability to mount these filesystems using the old mount(2) system call (see below for the rest of the fix).
* Remove not-NULL check for the data argument from the mount(2) entry point. Per the mount(2) man page, it is up to the individual filesystem being mounted to verify data. Or, in the case of procfs, etc. the filesystem is free to ignore the data parameter if it does not use it. Enforcing data to be not-NULL in the mount(2) system call entry point prevented passing NULL to filesystems which ignored the data pointer value. Apparently, passing NULL was common practice in such cases, as even our own mount_std(8) used to do it in the pre-nmount(2) world.
All userland programs in the tree were converted to nmount(2) long ago, but I've found at least one external program which broke due to this (presumably unintentional) mount(2) API change. One could argue that external programs should also be converted to nmount(2), but then there isn't much point in keeping the mount(2) interface for backward compatibility if it isn't backward compatible.
|
157685 |
12-Apr-2006 |
pjd |
Remove unused prototypes.
|
157342 |
31-Mar-2006 |
jeff |
- Add a bogus vhold/vdrop around vgone() in devfs_revoke. Without this the vnode is never recycled. It is bogus because the reference really should be associated with the devfs dirent.
|
156894 |
19-Mar-2006 |
tegge |
Call vn_start_write() before locking vnode.
|
156732 |
15-Mar-2006 |
rwatson |
Add a_fdidx to comment prototype for fifo_open().
MFC after: 3 days Submitted by: Kostik Belousov <kostikbel at gmail dot com>
|
156714 |
14-Mar-2006 |
rwatson |
If fifo_open() is called with a negative file descriptor, return EINVAL rather than panicking later. This can occur if the kernel calls vn_open() on a fifo, as there will be no associated file descriptor, and therefore the file descriptor operations cannot be modified to point to the fifo operation set.
MFC after: 3 days Reported by: Martin <nakal at nurfuerspam dot de> PR: 94278
|
156693 |
13-Mar-2006 |
joerg |
When encountering an ISO_SUSP_CFLAG_ROOT element in Rock Ridge processing, this actually means there's a double slash recorded in the symbolic link's path name. We used to start over from / then, which caused link targets like ../../bsdi.1.0/include//pathnames.h to be interpreted as /pathnames.h. This is both contradictory to our conventional slash interpretation, as well as potentially dangerous.
The right thing to do is (obviously) to just ignore that element.
bde once pointed out that mistake when he noticed it on the 4.4BSD-Lite2 CD-ROM, and asked me for help.
Reviewed by: bde (about half a year ago) MFC after: 3 days
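A minimal userland sketch of the behaviour described above, not the actual cd9660 Rock Ridge code: when symlink components are joined, a root element that appears after the first component is skipped instead of restarting the path from /. The name join_components and the use of an empty string to stand for an ISO_SUSP_CFLAG_ROOT element are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: join symlink components into out.
 * An empty string stands in for a "root" element.  A root element in
 * first position yields a leading '/'; anywhere else it is ignored.
 * Returns 0 on success, -1 if out is too small.
 */
int
join_components(const char *const *comp, size_t ncomp, char *out,
    size_t outlen)
{
	size_t used = 0;
	size_t i;

	if (outlen == 0)
		return (-1);
	out[0] = '\0';
	for (i = 0; i < ncomp; i++) {
		size_t len = strlen(comp[i]);

		if (len == 0) {
			/* Root element: only meaningful at the very start. */
			if (i == 0) {
				if (used + 1 >= outlen)
					return (-1);
				out[used++] = '/';
				out[used] = '\0';
			}
			continue;
		}
		/* Add a separator unless the output is empty or ends in '/'. */
		if (used > 0 && out[used - 1] != '/') {
			if (used + 1 >= outlen)
				return (-1);
			out[used++] = '/';
			out[used] = '\0';
		}
		if (used + len >= outlen)
			return (-1);
		memcpy(out + used, comp[i], len);
		used += len;
		out[used] = '\0';
	}
	return (0);
}
```

With the components of the link target quoted in the commit message, the embedded root element is dropped and the relative prefix survives.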
|
156585 |
12-Mar-2006 |
jeff |
- Define a null_getwritemount to get the mount-point for the lower filesystem so that nullfs doesn't permit you to circumvent snapshots.
Discussed with: tegge Sponsored by: Isilon Systems, Inc.
|
156095 |
28-Feb-2006 |
kris |
Correct the vnode locking in fdescfs.
PR: kern/93905 Submitted by: Kostik Belousov <kostikbel@gmail.com> Reviewed by: jeff MFC After: 1 week
|
156062 |
27-Feb-2006 |
yar |
CODA_COMPAT_5 may not be defined unconditionally in the coda5 module. Otherwise a kernel build would break in the coda5 module if the main kernel conf file enabled CODA_COMPAT_5, too. Redefined symbols are strictly disallowed by -Werror.
To overcome this issue, introduce a different symbol indicating coda5 build, CODA5_MODULE, and translate it to CODA_COMPAT_5 appropriately in /sys/coda/coda.h.
MFC after: 3 days
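A sketch of how such a translation could look in a header, assuming a guard against an existing definition; the exact preprocessor logic in /sys/coda/coda.h is not reproduced here. The #define of CODA5_MODULE at the top stands in for the module build passing it on the compiler command line, and coda_compat_5_enabled() is a hypothetical probe added only so the result can be checked.

```c
/* Stand-in for the module build passing -DCODA5_MODULE. */
#define	CODA5_MODULE

/*
 * Define CODA_COMPAT_5 at most once, whether it comes from the kernel
 * configuration or from the coda5 module build, so that no symbol is
 * redefined under -Werror.
 */
#if defined(CODA5_MODULE) && !defined(CODA_COMPAT_5)
#define	CODA_COMPAT_5
#endif

/* Hypothetical probe: 1 if the compatibility code would be compiled in. */
int
coda_compat_5_enabled(void)
{
#ifdef CODA_COMPAT_5
	return (1);
#else
	return (0);
#endif
}
```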
|
155922 |
22-Feb-2006 |
jhb |
Close some races between procfs/ptrace and exit(2):
- Reorder the events in exit(2) slightly so that we trigger the S_EXIT stop event earlier. After we have signalled that, we set P_WEXIT and then wait for any processes with a hold on the vmspace via PHOLD to release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops to zero.
- Change proc_rwmem() to require that the process being read from has its vmspace held via PHOLD by the caller, and get rid of all the junk to screw around with the vmspace reference count as we no longer need it.
- In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it doesn't exist.
- Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem() to clear an earlier single-step simulated via a breakpoint). We only do one to avoid races. Also, by making the EINVAL error for unknown requests be part of the default: case in the switch, the various switch cases can now just break out to return, which removes a _lot_ of duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug where a LWP ptrace command could return EINVAL with the proc lock still held.
- Changed the locking for ptrace_single_step(), ptrace_set_pc(), and ptrace_clear_single_step() to always be called with the proc lock held (it was a mixed bag previously). Alpha and arm have to drop the lock while they mess around with breakpoints, but other archs avoid extra lock release/acquires in ptrace(). I did have to fix a couple of other consumers in kern_kse and a few other places to hold the proc lock and PHOLD.
Tested by: ps (1 mostly, but some bits of 2-4 as well) MFC after: 1 week
|
155920 |
22-Feb-2006 |
jhb |
Change pfs_visible() to optionally return a pointer to the process associated with the passed in pfs_node. If it does return a pointer, it keeps the process locked. This allows a lot of places that were calling pfind() again right after pfs_visible() to not have to do that and avoids races since we don't drop the proc lock just to turn around and lock it again. This will become more important with future changes to fix races between procfs/ptrace and exit(2). Also, removed a duplicate pfs_visible() call in pfs_getextattr().
Reviewed by: des MFC after: 1 week
|
155918 |
22-Feb-2006 |
jhb |
Hold the proc lock while calling proc_sstep() since the function asserts it and remove a PRELE() that didn't have a matching PHOLD(). The calling code already has a PHOLD anyway.
MFC after: 1 week
|
155903 |
22-Feb-2006 |
jeff |
- We must hold a reference to a vnode before calling vgone() otherwise it may not be removed from the freelist.
MFC After: 1 week Found by: kris
|
155899 |
22-Feb-2006 |
jeff |
- spell VOP_LOCK(vp, LK_RELEASE... VOP_UNLOCK(vp,... so that asserts in vop_lock_post do not trigger.
- Rearrange null_inactive to null_hashrem earlier so there is no chance of finding the null node on the hash list after the locks have been switched.
- We should never have a NULL lowervp in null_reclaim() so there is no need to handle this situation. panic instead.
MFC After: 1 week
|
155898 |
22-Feb-2006 |
jeff |
- Assert that the lowervp is locked in null_hashget().
- Simplify the logic dealing with recycled vnodes in null_hashget() and null_hashins(). Since we hold the lower node locked in both cases the null node can not be undergoing recycling unless reclaim somehow called null_nodeget(). The logic that was in place was not safe and was essentially dead code.
MFC After: 1 week
|
155896 |
22-Feb-2006 |
jeff |
- Deadfs should not use the std GETWRITEMOUNT routine. Add one that always returns NULL.
MFC After: 1 week
|
155508 |
10-Feb-2006 |
jhb |
Correctly set MNTK_MPSAFE flag from the lower vnode's mount rather than always turning it on along with any flags set in the lower mount.
Tested by: kris Reviewed by: jeff MFC after: 3 days
|
155423 |
07-Feb-2006 |
jeff |
- No need to WANTPARENT when we're just going to vrele it in a deadlock prone way later.
Reported by: kkenn MFC After: 3 days
|
155256 |
03-Feb-2006 |
will |
Make UDF endian-safe.
Submitted by: Pedro Martelletto <pedro@ambientworks.net> (via scottl) Tested on: sparc64
|
155160 |
01-Feb-2006 |
jeff |
- Reorder calls to vrele() after calls to vput() when the vrele is a directory. vrele() may lock the passed vnode, which in these cases would give an invalid lock order of child -> parent. These situations are deadlock prone although do not typically deadlock because the vrele is typically not releasing the last reference to the vnode. Users of vrele must consider it as a call to vn_lock() and order it appropriately.
MFC After: 1 week Sponsored by: Isilon Systems, Inc. Tested by: kkenn
|
155034 |
30-Jan-2006 |
jeff |
- Remove a stale comment. This function was rewritten to be SMP safe some time ago.
Sponsored by: Isilon Systems, Inc.
|
154730 |
23-Jan-2006 |
trhodes |
Update incorrect comments here, there should not be a call to panic() over fs corruption.
Discussed with: alfred, phk
|
154692 |
22-Jan-2006 |
fjoe |
Do not assume that `char direntry::deExtension[3]' starts right after `char direntry::deName[8]' and access deExtension[] explicitly.
Found by: Coverity Prevent(tm) CID: 350, 351, 352
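A minimal sketch of the fix pattern, using a reduced stand-in for the on-disk directory entry (the real struct direntry has more fields, and dos_name_get is a hypothetical name): each array is copied through its own member, so nothing depends on deExtension[] being laid out immediately after deName[].

```c
#include <string.h>

/* Reduced stand-in for the msdosfs on-disk directory entry. */
struct direntry {
	char deName[8];		/* base name, space padded */
	char deExtension[3];	/* extension, space padded */
};

/*
 * Copy the 8.3 name into an 11-byte buffer by touching each array
 * explicitly, instead of reading 11 bytes starting at deName.
 */
void
dos_name_get(const struct direntry *de, char out[11])
{
	memcpy(out, de->deName, sizeof(de->deName));
	memcpy(out + sizeof(de->deName), de->deExtension,
	    sizeof(de->deExtension));
}
```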
|
154647 |
21-Jan-2006 |
rwatson |
Convert last four functions in coda_vnops.c to ANSI C function declarations. I knew I would get to fix something in Coda eventually.
MFC after: 1 week
|
154487 |
17-Jan-2006 |
alfred |
I ran into an nfs client panic a couple of times in a row over the last few days. I tracked it down to the fact that nfs_reclaim() is setting vp->v_data to NULL _before_ calling vnode_destroy_object(). After silence from the mailing list I checked further and discovered that ufs_reclaim() is unique among FreeBSD filesystems for calling vnode_destroy_object() early, long before tossing v_data or much of anything else, for that matter. The rest, including NFS, appear to be identical, as if they were just clones of one original routine.
The enclosed patch fixes all file systems in essentially the same way, by moving the call to vnode_destroy_object() to early in the routine (before the call to vfs_hash_remove(), if any). I have only tested NFS, but I've now run for over eighteen hours with the patch where I wouldn't get past four or five without it.
Submitted by: Frank Mayhar Requested by: Mohan Srinivasan MFC After: 1 week
|
154152 |
09-Jan-2006 |
tegge |
Add marker vnodes to ensure that all vnodes associated with the mount point are iterated over when using MNT_VNODE_FOREACH.
Reviewed by: truckman
|
154144 |
09-Jan-2006 |
maxim |
o Fix typo in the define: s/MRAK_INT_GEN/MARK_INT_GEN/. The typo was harmless because the define is not used in coda_vfsops.c.
Submitted by: Hugo Meiland
|
154054 |
05-Jan-2006 |
maxim |
o Typo in the debug message: s/skiped/skipped.
PR: kern/91346 Submitted by: Gavin Atkinson
|
153986 |
03-Jan-2006 |
rwatson |
When returning EIO from DEVFSIO_RADD ioctl, drop the exclusive rule lock. Otherwise the system comes to a rather sudden and grinding halt.
MFC after: 1 week
|
153706 |
24-Dec-2005 |
trhodes |
Make tv_sec a time_t on all platforms but alpha. Brings us more in line with POSIX. This also makes the struct correct if we ever implement an i386-time64 architecture. Not that we need to.
Reviewed by: imp, brooks Approved by: njl (acpica), des (no objects, touches procfs) Tested with: make universe
|
153400 |
14-Dec-2005 |
des |
Eradicate caddr_t from the VFS API.
|
153121 |
05-Dec-2005 |
avatar |
Recent nmount(2) adoption in mount_smbfs(8) did not flag the "long" option since mount_smbfs(8) assumed long name mounting by default unless "-n long" was explicitly specified.
Rather than supplying a "long" option in mount_smbfs(8), this commit brings back the original behaviour by associating SMBFS_MOUNT_NO_LONG with the "nolong" option. This should fix the broken long file names on smbfs people observed recently.
Reported by: Vladimir Grebenschikov <vova at fbsd dot ru> Reviewed by: phk Tested by: Slawa Olhovchenkov <slw at zxy dot spb dot ru>
|
153110 |
05-Dec-2005 |
ru |
Fix -Wundef warnings found when compiling i386 LINT, GENERIC and custom kernels.
|
153084 |
04-Dec-2005 |
ru |
Fix -Wundef from compiling the amd64 LINT.
|
153072 |
04-Dec-2005 |
ru |
Fix -Wundef.
|
152678 |
22-Nov-2005 |
bp |
Fix interaction with Windows 2000/XP based servers:
If the complete reply to the TRANS2_FIND_FIRST2 request fits exactly into one response packet, then the next call to TRANS2_FIND_NEXT2 will return zero entries and the server will close the current transaction. To avoid subsequent errors we should not perform a FIND_CLOSE2 request.
PR: kern/78953 Submitted by: Jim Carroll
|
152610 |
19-Nov-2005 |
rodrigc |
Properly parse the nowin95 mount option.
Tested by: Rainer Hurling <rhurlin at gwdg dot de>
|
152595 |
18-Nov-2005 |
rodrigc |
Add "shortnames" and "longnames" mount options which are synonyms for "shortname" and "longname" mount options. The old (before nmount()) mount_msdosfs program accepted "shortnames" and "longnames", but the kernel nmount() checked for "shortname" and "longname". So, make the kernel accept "shortnames", "longnames", "shortname", "longname" for forwards and backwards compatibility.
Discovered by: Rainer Hurling <rhurlin at gwdg dot de>
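One way to picture the synonym handling is a small lookup table in which both spellings map to the same flag. This is a userland sketch only: the flag values and the function name name_option_flag are hypothetical, and the kernel's own option parsing is not reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; the real msdosfs flags are not reproduced. */
#define	OPT_SHORTNAME	0x1
#define	OPT_LONGNAME	0x2

/* Map either spelling of an option to the same flag; 0 if unknown. */
int
name_option_flag(const char *opt)
{
	static const struct {
		const char *name;
		int flag;
	} table[] = {
		{ "shortname",	OPT_SHORTNAME },
		{ "shortnames",	OPT_SHORTNAME },
		{ "longname",	OPT_LONGNAME },
		{ "longnames",	OPT_LONGNAME },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strcmp(opt, table[i].name) == 0)
			return (table[i].flag);
	}
	return (0);
}
```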
|
152466 |
16-Nov-2005 |
rodrigc |
- Add errmsg to the list of smbfs mount options.
- Use vfs_mount_error() to propagate smbfs mount errors back to userspace.
Reviewed by: bp (smbfs maintainer)
|
152254 |
09-Nov-2005 |
dwhite |
This is a workaround for a complicated issue involving VFS cookies and devfs. The PR and patch have the details. The ultimate fix requires architectural changes and clarifications to the VFS API, but this will prevent the system from panicking when someone does "ls /dev" while running in a shell under the linuxulator.
This issue affects HEAD and RELENG_6 only.
PR: 88249 Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com> MFC after: 3 days
|
151897 |
31-Oct-2005 |
rwatson |
Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
- Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
- Disambiguate some collisions by adding subsystem prefixes to some memory types.
- Generally prefer lower case to upper case.
- If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.
Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
|
151453 |
18-Oct-2005 |
phk |
Use correct criteria for determining which directory entries we can purge right away and which we merely can hide.
Beaten into my skull by: kris
|
151447 |
18-Oct-2005 |
des |
Implement the full range of ISO9660 number conversion routines in iso.h.
MFC after: 2 weeks
|
151407 |
17-Oct-2005 |
rodrigc |
Unconditionally mount a CD9660 filesystem as read-only, instead of returning EROFS if we forget to mount it as read-only.
|
151406 |
17-Oct-2005 |
rodrigc |
Use the actual sector size of the media instead of hard-coding it to 2048. This eliminates KASSERTs in GEOM if we accidentally mount an audio CD as a cd9660 filesystem.
|
151405 |
17-Oct-2005 |
rodrigc |
Unconditionally mount a UDF filesystem as read-only, instead of returning an EROFS if we forget to mount it as read-only.
|
151396 |
17-Oct-2005 |
flz |
- Fix typo.
Approved by: ssouhlal MFC after: 1 week
|
151394 |
16-Oct-2005 |
truckman |
Update nwfs_lookup() to match the current cache_lookup() API. cache_lookup() has returned a ref'ed and locked vnode since vfs_cache.c:1.96, dated Tue Mar 29 12:59:06 2005 UTC. This change is similar to the change made to smbfs_lookup() in smbfs_vnops.c:1.58.
Tested by: "Antony Mawer" ant AT mawer.org MFC after: 2 weeks
|
151393 |
16-Oct-2005 |
kris |
Reflect mpsafety of the underlying filesystem in the nullfs image.
I benchmarked this by simultaneously extracting 4 large tarballs (basically world images) on a 4-processor AMD64 system, in a malloc-backed md.
With this patch, system time was reduced by 43%, and wall clock time by 33%.
Submitted by: jeff MFC after: 1 week
|
151392 |
16-Oct-2005 |
truckman |
Apply the same fix to a potential race in the ISDOTDOT code in cd9660_lookup() that was used to fix an actual race in ufs_lookup.c:1.78. This is not currently a hazard, but the bug would be activated by marking cd9660 as MPSAFE.
Requested by: bde
|
151349 |
14-Oct-2005 |
yar |
In preparation for making the modules actually use opt_*.h files provided in the kernel build directory, fix modules that were failing to build this way due to not quite correct kernel option usage. In particular:
ng_mppc.c uses two complementary options, both of which are listed in sys/conf/files. Ideally, there should be a separate option for including ng_mppc.c in kernel build, but now only NETGRAPH_MPPC_ENCRYPTION is usable anyway, the other one requires proprietary files.
nwfs and smbfs were trying to ensure they were built with proper network components, but the check was rather questionable.
Discussed with: ru
|
151316 |
14-Oct-2005 |
davidxu |
1. Change prototype of trapsignal and sendsig to use ksiginfo_t *; most changes in MD code are trivial. Before this change, trapsignal and sendsig used discrete parameters; now they use member fields of the ksiginfo_t structure. For sendsig, this change allows us to pass a POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc.
4. Add td_sigqueue to thread structure to hold all signals delivered to thread.
5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed.
6. In this sigqueue implementation, the pending signal set is kept as before, and an extra siginfo list holds additional siginfo_t data for signals. Kernel code that uses psignal() still behaves as before and will not fail even under memory pressure; the only exception is when deleting a signal, where we should call sigqueue_delete to remove the signal from the sigqueue rather than SIGDELSET. Currently no kernel code delivers a signal with additional data, so the kernel should be as stable as before. A ksiginfo can carry more information, for example, allowing a signal to be delivered but throwing away the siginfo data if there is not enough memory. SIGKILL and SIGSTOP have a fast path in sigqueue_add, because they cannot be caught or masked. The sigqueue() syscall allows user code to queue a signal to a target process; if resources are unavailable, EAGAIN is returned as the specification says. Just before a thread exits, signal queue memory is freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64
|
151157 |
09-Oct-2005 |
rodrigc |
- Do not hardcode the bsize to a sectorsize of 2048, even though the UDF specification specifies a logical sectorsize of 2048. Instead, get it from GEOM.
- When reading the UDF Anchor Volume Descriptor, use the logical sectorsize of 2048 when calculating the offset to read from, but use the actual sectorsize to determine how much to read.
- works with reading a DVD disk and a DVD disk image file via mdconfig
- correctly returns EINVAL if we try to mount_udf an audio CD, instead of panicking inside GEOM when INVARIANTS is set
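The split between the two sector sizes can be sketched as follows. This is an illustration under stated assumptions, not the udf_vfsops.c code: struct read_req and udf_desc_req are hypothetical names, and in the kernel the media sector size would come from GEOM rather than from a plain argument.

```c
#include <stdint.h>

/* Logical sector size named in the commit message. */
#define	UDF_LOGICAL_SECSIZE	2048

/* Hypothetical description of one read request. */
struct read_req {
	uint64_t offset;	/* byte offset on the media */
	uint32_t length;	/* number of bytes to read */
};

/*
 * Build the request for a descriptor at logical sector lsn on a device
 * whose actual sector size is media_secsize: the offset is computed
 * with the logical size, the length with the actual size.
 */
struct read_req
udf_desc_req(uint32_t lsn, uint32_t media_secsize)
{
	struct read_req req;

	req.offset = (uint64_t)lsn * UDF_LOGICAL_SECSIZE;
	req.length = media_secsize;
	return (req);
}
```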
|
151054 |
07-Oct-2005 |
pjd |
We don't need 'imp' here.
|
150794 |
01-Oct-2005 |
rwatson |
Second attempt at a work-around for fifo-related socket panics during make -j with high levels of parallelism: acquire Giant in fifo I/O routines.
Discussed with: ups MFC after: 3 days
|
150761 |
30-Sep-2005 |
phk |
The NWFS code in RELENG_6 is broken due to a typo in sys/fs/nwfs/nwfs_vfsops.c, introduced with the conversion to nmount with revision 1.38. This causes mount_nwfs to fail with the error message:
mount_nwfs: mount error: /mnt/netware: syserr = No such file or directory
This is caused by a typo on line 178, which specifies "nwfw_args" rather than "nwfs_args".
Submitted by: Antony Mawer <gnats@mawer.org> Fat fingers: phk PR: 86757 MFC: 3 days
|
150711 |
29-Sep-2005 |
peadar |
Remove checks for BOOTSIG[23] from FAT32 bootblocks.
There seems to be very little documentary evidence outside this implementation to suggest that these checks are necessary, and more than one camera-formatted flash disk fails the check, but mounts successfully on most other systems.
Reviewed By: bde@
|
150623 |
27-Sep-2005 |
rwatson |
Back out fifo_vnops.c:1.127, which introduced an sx lock around I/O on a fifo. While this did indeed close the race, confirming suspicions about the nature of the problem, it causes difficulties with blocking I/O on fifos.
Discussed with: ups Also spotted by: Peter Holm <peter at holm dot cc>
|
150561 |
26-Sep-2005 |
rwatson |
Assert v_fifoinfo is non-NULL in fifo_close() in order to catch non-conforming cases sooner.
MFC after: 3 days Reported by: Peter Holm <peter at holm dot cc>
|
150545 |
25-Sep-2005 |
rwatson |
Lock the read socket receive buffer when frobbing the sb_state flag on that socket during open, not the write socket receive buffer. This might explain clearing of the sb_state SB_LOCK flag seen occasionally in soreceive() on fifos.
MFC after: 3 days Spotted by: ups
|
150501 |
24-Sep-2005 |
phk |
Make rule zero really magical, that way we don't have to do anything when we mount and get zero cost if no rules are used in a mountpoint.
Add code to deref rules on unmount.
Switch from SLIST to TAILQ.
Drop SYSINIT, use SX_SYSINIT and static initializer of TAILQ instead.
Drop goto, a break will do.
Reduce double pointers to single pointers.
Combine reaping and destroying rulesets.
Avoid memory leaks in some error cases.
|
150486 |
23-Sep-2005 |
rwatson |
For reasons of consistency (and necessity), assert an exclusive vnode lock on the fifo vnode in fifo_open(): we rely on the vnode lock to serialize access to v_fifoinfo.
MFC after: 3 days
|
150462 |
22-Sep-2005 |
rwatson |
Add fi_sx, an sx lock to serialize I/O operations on the socket pair underlying the POSIX fifo implementation. In 6.x/7.x, fifo access is moved from the VFS layer, where it was serialized using the vnode lock, to the file descriptor layer, where access is protected by a reference count but not serialized. This exposed socket buffer locking to high levels of parallelism in specific fifo workloads, such as make -j 32, which expose as yet unresolved socket buffer bugs.
fi_sx re-adds serialization about the read and write routines, although not paths that simply test socket buffer mbuf queue state, such as the poll and kqueue methods. This restores the extra locking cost previously present in some cases, but is an effective workaround for the instability that has been experienced. This workaround should be removed once the bug in socket buffer handling has been fixed.
Reported by: kris, jhb, Julien Gabel <jpeg at thilelli dot net>, Peter Holm <peter at holm dot cc>, others MFC after: 3 days
|
150342 |
19-Sep-2005 |
phk |
Revamp DEVFS internals pretty severely [1].
Give DEVFS a proper inode called struct cdev_priv. It is important to keep in mind that this "inode" is shared between all DEVFS mountpoints, therefore it is protected by the global device mutex.
Link the cdev_priv's into a list, protected by the global device mutex. Keep track of each cdev_priv's state with a flag bit and of references from mountpoints with a dedicated usecount.
Reap the benefits of much improved kernel memory allocator and the generally better defined device driver APIs to get rid of the tables of pointers + serial numbers, their overflow tables, the atomics to muck about in them and all the trouble that resulted in.
This makes RAM the only limit on how many devices we can have.
The cdev_priv is actually a super struct containing the normal cdev as the "public" part, and therefore allocation and freeing has moved to devfs_devs.c from kern_conf.c.
The overall responsibility is (to be) split such that kern/kern_conf.c is the stuff that deals with drivers and struct cdev, and fs/devfs handles filesystems and struct cdev_priv, with their private liaison exposed only in devfs_int.h.
Move the inode number from cdev to cdev_priv and allocate inode numbers properly with unr. Local dirents in the mountpoints (directories, symlinks) allocate inodes from the same pool to guarantee against overlaps.
Various other fields are going to migrate from cdev to cdev_priv in the future in order to hide them. A few fields may migrate from devfs_dirent to cdev_priv as well.
Protect the DEVFS mountpoint with an sx lock instead of lockmgr, this lock also protects the directory tree of the mountpoint.
Give each mountpoint a unique integer index, allocated with unr. Use it as an index into an array of devfs_dirent pointers in each cdev_priv. Initially the array points to a single element also inside cdev_priv, but as more devfs instances are mounted, the array is extended with malloc(9) as necessary when the filesystem populates its directory tree.
Retire the cdev alias lists; the cdev_priv now knows about all the relevant devfs_dirents (and their vnodes) and devfs_revoke() will pick them up from there. We still spelunk into other mountpoints and fondle their data without 100% good locking. It may make better sense to vector the revoke event into the tty code and there do a destroy_dev/make_dev on the tty's devices, but that's for further study.
Lots of shuffling of stuff and churn of bits for no good reason[2].
XXX: There is still nothing preventing the dev_clone EVENTHANDLER from being invoked at the same time in two devfs mountpoints. It is not obvious what the best course of action is here.
XXX: comment out an if statement that lost its body, until I can find out what should go there so it doesn't do damage in the meantime.
XXX: Leave in a few extra malloc types and KASSERTS to help track down any remaining issues.
Much testing provided by: Kris Much confusion caused by (races in): md(4)
[1] You are not supposed to understand anything past this point.
[2] This line should simplify life for the peanut gallery.
|
150281 |
18-Sep-2005 |
rwatson |
Assert that (vp) is locked in fifo_close(), since we rely on the exclusive vnode lock to synchronize the reference counts on struct fifoinfo.
MFC after: 3 days
|
150200 |
15-Sep-2005 |
phk |
Don't attempt to recurse lockmgr, it doesn't like it.
|
150181 |
15-Sep-2005 |
kan |
Handle a race condition where NULLFS vnode can be cleaned while threads can still be asleep waiting for lowervp lock.
Tested by: kkenn Discussed with: ssouhlal, jeffr
|
150165 |
15-Sep-2005 |
rwatson |
The socket pointers in fifoinfo are not permitted to be NULL, so don't check if they are, it just confuses the fifo code more.
MFC after: 3 days
|
150151 |
15-Sep-2005 |
phk |
Various minor polishing.
|
150150 |
15-Sep-2005 |
phk |
Protect the devfs rule internal global lists with a sx lock, the per mount locks are not enough. Finer granularity (x)locking could be implemented, but I prefer to keep it simple for now.
|
150149 |
15-Sep-2005 |
phk |
Absolve devfs_rule.c from locking responsibility and call it with all necessary locking held.
|
150147 |
15-Sep-2005 |
phk |
Close a race which could result in unwarranted "ruleset %d already running" panics.
Previously, recursion through the "include" feature was prevented by marking each ruleset as "running" when applied. This doesn't work for the case where two DEVFS instances try to apply the same ruleset at the same time.
Instead introduce the sysctl vfs.devfs.rule_depth (default == 1) which limits how many levels of "include" we will traverse.
Be aware that traversal of "include" is recursive and kernel stack size is limited.
MFC: after 3 days
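The depth-limited traversal can be pictured with a small userland sketch. It assumes that a depth of 1 allows one level of "include", which is one reading of the description above; struct ruleset and ruleset_apply are hypothetical and far simpler than the devfs rule code.

```c
#include <stddef.h>

/* Hypothetical ruleset: a rule count plus an optional included ruleset. */
struct ruleset {
	int nrules;
	const struct ruleset *include;
};

/*
 * Count the rules that would be applied, following "include" at most
 * depth levels deep.  A ruleset that includes itself still terminates,
 * because depth reaches zero; no per-ruleset "running" mark is needed,
 * so two callers can traverse the same ruleset at the same time.
 */
int
ruleset_apply(const struct ruleset *rs, int depth)
{
	int applied;

	if (rs == NULL)
		return (0);
	applied = rs->nrules;
	if (rs->include != NULL && depth > 0)
		applied += ruleset_apply(rs->include, depth - 1);
	return (applied);
}
```

Because each level of "include" costs a stack frame, the depth limit also bounds stack use, which matches the warning about kernel stack size.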
|
150096 |
13-Sep-2005 |
rwatson |
Trim down now (believed to be) unused fifo_ioctl() and fifo_kqfilter() VOP implementations, since they in theory are used only on open file descriptors, in which case the ioctls are via fifo_ioctl_f() and kqueue requests are via fifo_kqfilter_f(). Generate warnings if they are entered for now. These printf() calls should become panic() calls.
Annotate and re-implement fifo_ioctl_f(): don't arbitrarily forward ioctls to the socket layer, only forward the ones we explicitly support for fifos. In the case of FIONREAD, don't forward the request to the write socket on a read-write fifo, or the read result is overwritten. Annotate a nasty case for the undefined POSIX O_RDWR on fifos, in which failure of the second ioctl will result in the socket pair being in an inconsistent state.
Assert copyright as I find myself rewriting non-trivial parts of fifofs.
MFC after: 3 days
|
150077 |
13-Sep-2005 |
rwatson |
As a result of kqueue locking work, socket buffer locks will always be held when entering a kqueue filter for fifos via a socket buffer event: as such, assert the lock unconditionally rather than acquiring it conditionally.
MFC after: 3 days
|
150074 |
13-Sep-2005 |
rwatson |
Annotate two issues:
1) fifo_kqfilter() is not actually ever used, it likely should be GC'd.
2) fifo_kqfilter_f() doesn't implement EVFILT_VNODE, so detecting events on the underlying vnode for a fifo no longer works (it did in 4.x). Likely, fifo_kqfilter_f() should forward the request to the VFS using fp->f_vnode, which would work once fifo_kqfilter() was detached from the vnode operation vector (removing the fifo override).
Discussed with: phk
|
150066 |
12-Sep-2005 |
rwatson |
Introduce no-op nosup fifo kqueue filter and detach routine, which are used when a read filter is requested on a write-only fifo descriptor, or a write filter is requested on a read-only fifo descriptor. This permits the filters to be registered, but never raises the event, which causes kqueue behavior for fifos to more closely match similar semantics for poll and select, which permit testing for the condition even though the condition will never be raised, and is consistent with POSIX's notion that a fifo has identical semantics to a one-way IPC channel created using pipe() on most operating systems.
The fifo regression test suite can now run to completion on HEAD without errors.
MFC after: 3 days
|
150060 |
12-Sep-2005 |
rwatson |
When a request is made to register a filter on a fifo that doesn't apply to the fifo (i.e., not EVFILT_READ or EVFILT_WRITE), reject it as EINVAL, not by returning 1 (EPERM).
MFC after: 3 days
|
150033 |
12-Sep-2005 |
rwatson |
Remove DFLAG_SEEKABLE from fifo file descriptors: fifos are not seekable according to POSIX, not to mention the fact that it doesn't make sense (and hence isn't really implemented). This causes the fifo_misc regression test to succeed.
|
150027 |
12-Sep-2005 |
rwatson |
Only poll the fifo for read events if the fifo is attached to a readable file descriptor. Otherwise, the read end of a fifo might return that it is writable (which it isn't).
Only poll the fifo for write events if the fifo is attached to a writable file descriptor. Otherwise, the write end of a fifo might return that it is readable (which it isn't).
In the event that a file is FREAD|FWRITE (which is allowed by POSIX, but has undefined behavior), we poll for both.
MFC after: 3 days
|
150026 |
12-Sep-2005 |
rwatson |
After going to some trouble to identify only the write-related events to poll the write socket for, the fifo polling code proceeded to poll for the complete set of events. Use 'levents' instead of 'events' as the argument to poll, and only poll the write socket if there is interest in write events.
MFC after: 3 days
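A sketch of the masking described above, assuming only POLLOUT counts as a write-related event and using a stub in place of the socket layer; write_socket_poll and fifo_poll_write_side are hypothetical names, not the fifo_vnops.c functions.

```c
#include <poll.h>

/* Events that concern the write side of the fifo in this sketch. */
#define	WRITE_EVENTS	(POLLOUT)

/*
 * Hypothetical stand-in for polling the write socket: it reports
 * whichever of the requested events it considers ready, including
 * read events if it is (wrongly) asked about them.
 */
static int
write_socket_poll(int requested)
{
	return (requested & (POLLIN | POLLOUT));
}

/*
 * Pass the filtered set (levents), not the caller's full set (events),
 * and skip the write socket entirely when no write event is wanted.
 */
int
fifo_poll_write_side(int events)
{
	int levents = events & WRITE_EVENTS;

	if (levents == 0)
		return (0);
	return (write_socket_poll(levents));
}
```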
|
150025 |
12-Sep-2005 |
rwatson |
When a writer opens a fifo, wake up the read socket for read, not the write socket.
MFC after: 3 days
|
150024 |
12-Sep-2005 |
rwatson |
Add an assertion that fifo_open() doesn't race against other threads while sleeping to allocate fifo state: due to using the vnode lock to serialize access to a fifo during open, it shouldn't happen (tm).
MFC after: 3 days
|
150023 |
12-Sep-2005 |
rwatson |
Rather than reaching into the internals of the UNIX domain socket code by calling uipc_connect2() to connect two socket endpoints to create a fifo, call soconnect2().
MFC after: 3 days
|
150019 |
12-Sep-2005 |
phk |
Clean up prototypes.
|
149991 |
11-Sep-2005 |
rodrigc |
Cast bf_sysid to const char * when passing it to strncmp(), because strncmp does not take an unsigned char *. Eliminates warning with GCC 4.0.
|
149990 |
11-Sep-2005 |
rodrigc |
Do not declare M_NTFSMNT with extern linkage here, since it is defined with static linkage in ntfs_vfsops.c. Fixes compilation with GCC 4.0.
|
149850 |
07-Sep-2005 |
obrien |
Ensure the full value is written into inode variables.
PR: 85503 Submitted by: Dmitry Pryanishnikov <dmitry@atlantis.dp.ua>
|
149771 |
03-Sep-2005 |
ssouhlal |
Unbreak hpfs/ntfs/udf/ext2fs/reiserfs mounting.
Another pointyhat to: ssouhlal
|
149745 |
03-Sep-2005 |
ssouhlal |
Unbreak the build.
Pointyhat to: ssouhlal
|
149722 |
02-Sep-2005 |
ssouhlal |
Use vput() instead of vrele() in null_reclaim() since the lower vnode is locked.
MFC after: 3 days
|
149720 |
02-Sep-2005 |
ssouhlal |
*_mountfs() (if the filesystem mounts from a device) needs devvp to be locked, so lock it.
Glanced at by: phk MFC after: 3 days
|
149573 |
29-Aug-2005 |
phk |
Add a missing dev_relthread() call.
Remove unused variable.
Spotted by: Hans Petter Selasky <hselasky@c2i.net>
|
149177 |
17-Aug-2005 |
phk |
Handle device drivers with D_NEEDGIANT in a way which does not penalize the 'good' drivers: Allocate a shadow cdevsw and populate it with wrapper functions which grab Giant
|
149146 |
16-Aug-2005 |
phk |
Collect the devfs related sysctls in one place
|
149144 |
16-Aug-2005 |
phk |
Create a new internal .h file to communicate very private stuff from kern_conf.c to devfs.
For now just two prototypes, more to come.
|
149107 |
15-Aug-2005 |
phk |
Eliminate effectively unused dm_basedir field from devfs_mount.
|
149045 |
14-Aug-2005 |
grehan |
- restore the ability to mount cd9660 filesystems as root by inverting some of the option tests, specifically the joliet and rockridge tests. Since the root mount callchain doesn't go through cd9660_cmount, the default mount options aren't set. Rather than having the main codepath assume the options are there, test for the absence of the inverted option
e.g. instead of vfs_flagopt(.. "joliet" ..), test for !vfs_flagopt(.. "nojoliet" ..)
This works for root mount, non-root mount and future nmount cases.
- in cd9660_cmount, remove inadvertent setting of "gens" when "extatt" was set.
Reported by: grehan, Dario Freni <saturnero at freesbie org> Tested by: Dario Freni Not objected to by: phk
MFC after: 3 days
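
The inverted option test from r149045 can be illustrated with a small self-contained sketch. The helpers opt_present() and joliet_enabled() are hypothetical and only mimic the idea; the kernel code uses vfs_flagopt() on the mount options instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is `name` among the n option strings? */
static int
opt_present(const char *const *opts, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(opts[i], name) == 0)
			return (1);
	return (0);
}

/*
 * Joliet is on unless "nojoliet" was given. An empty option list, as in
 * the root mount path where no defaults are set, therefore enables it.
 */
static int
joliet_enabled(const char *const *opts, size_t n)
{
	return (!opt_present(opts, n, "nojoliet"));
}
```

Testing for the absence of the negative option means the default behaviour no longer depends on some earlier code path having populated the option list.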
|
148984 |
12-Aug-2005 |
des |
Eliminate an unnecessary bcopy().
|
148920 |
10-Aug-2005 |
obrien |
Remove public declarations of variables that were forgotten when they were made static.
|
148919 |
10-Aug-2005 |
obrien |
Remove the need to forward declare statics by moving them around.
|
148868 |
08-Aug-2005 |
rwatson |
Merge the dev_clone and dev_clone_cred event handlers into a single event handler, dev_clone, which accepts a credential argument. Implementors of the event can ignore it if they're not interested, and most do. This avoids having multiple event handler types and fall-back/precedence logic in devfs.
This changes the kernel API for /dev cloning, and may affect third party packages containing cloning kernel modules.
Requested by: phk MFC after: 3 days
|
148547 |
29-Jul-2005 |
kris |
devfs is not yet fully MPSAFE - for example, multiple concurrent devfs(8) processes can cause a panic when operating on rulesets.
Approved by: phk
|
148182 |
20-Jul-2005 |
simon |
Correct devfs ruleset bypass.
Submitted by: csjp Reviewed by: phk Security: FreeBSD-SA-05:17.devfs Approved by: cperciva
|
148089 |
17-Jul-2005 |
imura |
[1] unix2doschr() If a character cannot be converted to DOS code page, unix2doschr() returned `0'. As a result, unix2dosfn() was forced to return `0', so we saw a file which was composed of these characters as `Invalid argument'. To correct this, if a character can be converted to Unicode, unix2doschr() now returns `1' which is a magic number to make unix2dosfn() know that the character must be converted to `_'.
[2] unix2dosfn() The above-mentioned solution only works if a file has both a Unicode name and a DOS code page name. The Unicode name would not be recorded if the file name fits within 11 bytes (DOS short name) and if no conversion from Unix charset to DOS code page has occurred. Thus, FreeBSD can create a file which has only a short name, but there is no guarantee that the short name always contains valid characters, because we leave it to the user of mount_msdosfs(8) to select which conversion is used between DOS code page and Unix charset. To avoid this, the Unicode file name should be recorded unless every character is an ASCII character. This is the way Windows XP does it.
PR: 77074 [1] MFC after: 1 week
|
147982 |
14-Jul-2005 |
rwatson |
When devfs cloning takes place, provide access to the credential of the process that caused the clone event to take place for the device driver creating the device. This allows cloned device drivers to adapt the device node based on security aspects of the process, such as the uid, gid, and MAC label.
- Add a cred reference to struct cdev, so that when a device node is instantiated as a vnode, the cloning credential can be exposed to MAC.
- Add make_dev_cred(), a version of make_dev() that additionally accepts the credential to stick in the struct cdev. Implement it and make_dev() in terms of a back-end make_dev_credv().
- Add a new event handler, dev_clone_cred, which can be registered to receive the credential instead of dev_clone, if desired.
- Modify the MAC entry point mac_create_devfs_device() to accept an optional credential pointer (may be NULL), so that MAC policies can inspect and act on the label or other elements of the credential when initializing the skeleton device protections.
- Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(), so that the pty clone credential is exposed to the MAC Framework.
While currently primarily focussed on MAC policies, this change is also a prerequisite for changes to allow ptys to be instantiated with the UID of the process looking up the pty. This requires further changes to the pty driver -- in particular, to immediately recycle pty nodes on last close so that the credential-related state can be recreated on next lookup.
Submitted by: Andrew Reisse <andrew.reisse@sparta.com> Obtained from: TrustedBSD Project Sponsored by: SPAWAR, SPARTA MFC after: 1 week MFC note: Merge to 6.x, but not 5.x for ABI reasons
|
147857 |
09-Jul-2005 |
tanimura |
Regrab dvp only when ISDOTDOT.
Approved by: re (scottl)
|
147809 |
07-Jul-2005 |
jeff |
- Since we don't hold a usecount in pfs_exit we have to get a holdcnt prior to calling vgone() to prevent any races.
Sponsored by: Isilon Systems, Inc. Approved by: re (vfs blanket)
|
147692 |
30-Jun-2005 |
peter |
Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work.
ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets.
IA64 has got stubs for ia32_reg.c.
Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb to pacify it over conflicting formats of ld-elf.so.1.
Approved by: re
|
147676 |
30-Jun-2005 |
peter |
Conditionally weaken sys_generic.c rev 1.136 to allow certain dubious ioctl numbers in backwards compatibility mode. eg: an IOC_IN ioctl with a size of zero. Traditionally this was what you did before IOC_VOID existed, and we had some established users of this in the tree, namely procfs. Certain 3rd party drivers with binary userland components also do this.
This is necessary to have 4.x and 5.x binaries use these ioctl's. We found this at work when trying to run 4.x binaries.
Approved by: re
|
146984 |
05-Jun-2005 |
imura |
Avoid casting from (int *) to (size_t *) in order to fix udf_iconv on amd64.
Reviewed by: scottl MFC after: 2 weeks
|
146823 |
31-May-2005 |
rodrigc |
Do not declare a struct as extern, and then implement it as static in the same file. This is not legal C, and GCC 4.0 will issue an error.
Reviewed by: phk Approved by: das (mentor)
|
146121 |
11-May-2005 |
brueffer |
Fix three typos in comments. Two of them obtained from OpenBSD.
MFC after: 3 days
|
146115 |
11-May-2005 |
kan |
Do not dereference dvp pointer before doing a NULL check.
Noticed by: Coverity Prevent analysis tool.
|
145974 |
06-May-2005 |
anholt |
Staticize a symbol used only in this file.
PR: kern/43613 Submitted by: Matt Emmerton, matt at gsicomp dot on dot ca
|
145939 |
06-May-2005 |
robert |
The printf(9) `%p' conversion specifier puts an "0x" in front of the pointer value. Therefore, remove the "0x" from the format string.
|
145938 |
06-May-2005 |
robert |
Fix our NTFS readdir function.
To check a directory's in-use bitmap bit by bit, we use a pointer to an 8 bit wide unsigned value.
The index used to dereference this pointer is calculated by shifting the bit index right 3 bits. Then we do a logical AND with the bit# represented by the lower 3 bits of the bit index.
This is an idiomatic way of iterating through a bit map with simple bitwise operations.
This commit fixes the bug that we only checked bits 3:0 of each 8 bit chunk, because we only used bits 1:0 of the bit index for the bit# in the current 8 bit value. This resulted in files not being returned by getdirentries(2).
Change the type of the bit map pointer from `char *' to `u_int8_t *'.
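
The bitmap idiom described in r145938 can be sketched as below. This is an illustration with hypothetical function names, not the NTFS readdir code: the byte index is the bit index shifted right by 3, and the bit within that byte is selected by the low 3 bits. The second function reproduces the described bug by masking with only the low 2 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Correct test: byte = bit / 8, position within byte = bit % 8. */
static int
bitmap_test(const uint8_t *bmp, size_t bit)
{
	return ((bmp[bit >> 3] & (1u << (bit & 7))) != 0);
}

/*
 * Buggy variant for comparison: masking with 3 keeps only bits 1:0 of
 * the index, so bits 7:4 of every byte are never examined.
 */
static int
bitmap_test_buggy(const uint8_t *bmp, size_t bit)
{
	return ((bmp[bit >> 3] & (1u << (bit & 3))) != 0);
}
```

With a byte of 0x80, the correct version reports bit 7 as set while the buggy one looks at bit 3 and reports it clear, which matches the symptom of directory entries going missing.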
|
145900 |
05-May-2005 |
takawata |
Fix breakage on alpha.
Pointed out by: hrs via IRC
|
145872 |
04-May-2005 |
takawata |
Make smbfs capable of using 16-bit character sets in filenames.
PR: 78110
|
145825 |
03-May-2005 |
jeff |
- Set the v_object pointer after a successful VOP_OPEN(). This isn't a perfect solution as the lower vm object can change at unpredictable times if our lower vp happens to be on another unionfs, etc.
Submitted by: Oleg Sharoiko <os@rsu.ru>
|
145730 |
01-May-2005 |
jeff |
- In devfs_open() and devfs_close() grab Giant if the driver sets NEEDGIANT. We still have to DROP_GIANT and PICKUP_GIANT when NEEDGIANT is not set because vfs is still sometimes entered with Giant held.
|
145714 |
30-Apr-2005 |
des |
Fix an old pasto.
|
145698 |
30-Apr-2005 |
jeff |
- Mark devfs as MNTK_MPSAFE as I believe it does not require Giant.
Sponsored by: Isilon Systems, Inc. Agreed in principle by: phk
|
145586 |
27-Apr-2005 |
jeff |
- Fix several locking problems in unionfs_mount so that it will come closer to passing DEBUG_VFS_LOCKS.
|
145585 |
27-Apr-2005 |
jeff |
- Pass the ISOPEN flag down to our lower filesystems. - Remove an erroneous VOP lock assert.
|
145424 |
22-Apr-2005 |
jeff |
- As this is presently the one and only place where duplicate acquires of the vnode interlock are allowed mark it by passing MTX_DUPOK to this lock operation only.
Sponsored by: Isilon Systems, Inc.
|
145174 |
16-Apr-2005 |
das |
Disable negative name caching for msdosfs to work around a bug. Since the name cache is case-sensitive and msdosfs isn't, creating a file 'foo' won't invalidate a negative entry for 'FOO'. There are similar problems related to 8.3 filenames.
A better solution is to override VOP_LOOKUP with a method that canonicalizes the name, then calls vfs_cache_lookup(). Unfortunately, it's not quite that simple because vfs_cache_lookup() will call msdosfs_lookup() on a cache miss, and msdosfs_lookup() needs a way to get at the original component name.
|
145131 |
16-Apr-2005 |
njl |
Fix mbnambuf support for multi-byte characters. If a substring is larger than WIN_CHARS bytes, we shift the suffix (previous substrings) upwards by the amount this substring exceeds its WIN_CHARS slot. Profiling shows this change is indistinguishable from the previous code at 95% confidence. This bug would result in attempts to access or create files or directories with multi-byte characters returning an error but no data loss.
Reported and tested by: avatar MFC after: 3 days
|
145072 |
14-Apr-2005 |
brueffer |
Correct typo.
Obtained from: OpenBSD
|
145006 |
13-Apr-2005 |
jeff |
- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. See vfs_lookup.c r1.79 for details.
Sponsored by: Isilon Systems, Inc.
|
144904 |
11-Apr-2005 |
jeff |
- Clear VI_OWEINACT before calling vget() with no lock type. We know the node is actually already locked, and VOP_INACTIVE is not desirable in this case.
|
144903 |
11-Apr-2005 |
jeff |
- Honor the flags argument passed to null_root(). The filesystem below us will decide whether or not to grab a real shared lock.
|
144852 |
10-Apr-2005 |
delphij |
Initialize vp before using it. Failing to do this can cause instant panic when trying to access a file on mounted smbfs.
Submitted by: takawata at jp freebsd org
|
144740 |
07-Apr-2005 |
phk |
Give msdosfs a unique inode number which is really the byteoffset of the directory entry.
This solves the corruption problem, I believe.
Regression test script by: silby
|
144620 |
04-Apr-2005 |
jeff |
- Fix union's assumptions about when the dvp is unlocked. It is only unlocked in the ISDOTDOT case now, not for all !ISLASTCN lookups.
|
144389 |
31-Mar-2005 |
phk |
Explicitly hold a reference to the cdev we have just cloned. This closes the race where the cdev was reclaimed before it ever made it back to devfs lookup.
|
144385 |
31-Mar-2005 |
phk |
cdev (still) needs per instance uid/gid/mode
Add unlocked version of dev_ref()
Clean up various stuff in sys/conf.h
|
144384 |
31-Mar-2005 |
phk |
Rename dev_ref() to dev_refl()
|
144366 |
31-Mar-2005 |
jeff |
- LK_NOPAUSE is a nop now.
Sponsored by: Isilon Systems, Inc.
|
144299 |
29-Mar-2005 |
jeff |
- Remove wantparent, it is no longer necessary. An assert in vfs_lookup.c prevents any callers from doing a modifying op without LOCKPARENT or WANTPARENT.
|
144298 |
29-Mar-2005 |
jeff |
- Remove wantparent, it is no longer necessary. An assert in vfs_lookup.c prevents any callers from doing a DELETE or RENAME without locking the parent.
|
144297 |
29-Mar-2005 |
jeff |
- cache_lookup() now locks the new vnode for us to prevent some races. Remove redundant code.
Sponsored by: Isilon Systems, Inc.
|
144230 |
28-Mar-2005 |
jeff |
- Correct the dprintf format in the _lookup routine.
Spotted by: pjd
|
144228 |
28-Mar-2005 |
jeff |
- Garbage collect an unused variable.
|
144227 |
28-Mar-2005 |
jeff |
- Don't panic if we can't lock a child in lookup, return an error instead. - Only unlock the directory if this is a DOTDOT lookup. Previously this code could have deadlocked if there was a DOTDOT lookup with LOCKPARENT set and another thread was locking the other way up the tree.
Sponsored by: Isilon Systems, Inc.
|
144225 |
28-Mar-2005 |
jeff |
- Remove unnecessary LOCKPARENT manipulation.
Sponsored by: Isilon Systems, Inc.
|
144215 |
28-Mar-2005 |
jeff |
- nwfs_lookup() is no longer responsible for unlocking the dvp, this is handled in vfs_lookup.c. This code was missing PDIRUNLOCK use prior to the removal of PDIRUNLOCK in rev 1.73 of vfs_lookup.c.
Sponsored by: Isilon Systems, Inc.
|
144213 |
28-Mar-2005 |
jeff |
- hpfs_lookup() is no longer responsible for unlocking the dvp, this is handled in vfs_lookup.c. This code was missing PDIRUNLOCK use prior to the removal of PDIRUNLOCK in rev 1.73 of vfs_lookup.c.
Sponsored by: Isilon Systems, Inc.
|
144208 |
28-Mar-2005 |
jeff |
- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us.
Sponsored by: Isilon Systems, Inc.
|
144207 |
28-Mar-2005 |
jeff |
- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. - In the ISDOTDOT case we have to unlock the dvp before locking the child, if this fails we must relock dvp before returning an error. This was missing before.
Sponsored by: Isilon Systems, Inc.
|
144206 |
28-Mar-2005 |
jeff |
- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. - Network filesystems are written with a special idiom that checks the cache first, and may even unlock dvp before discovering that a network round-trip is required to resolve the name. I believe dvp is prevented from being recycled even in the forced unmount case by the shared lock on the mount point. If not, this code should grow checks for VI_DOOMED after it relocks dvp or it will access NULL v_data fields.
Sponsored by: Isilon Systems, Inc.
|
144103 |
25-Mar-2005 |
jeff |
- Pass LK_EXCLUSIVE as the lock type to vget in vfs_hash_insert().
|
144059 |
24-Mar-2005 |
jeff |
- Update vfs_root implementations to match the new prototype. None of these filesystems will support shared locks until they are explicitly modified to do so. Careful review must be done to ensure that this is safe for each individual filesystem.
Sponsored by: Isilon Systems, Inc.
|
144058 |
24-Mar-2005 |
jeff |
- Update vfs_root implementations to match the new prototype. None of these filesystems will support shared locks until they are explicitly modified to do so. Careful review must be done to ensure that this is safe for each individual filesystem.
Sponsored by: Isilon Systems, Inc.
|
143841 |
19-Mar-2005 |
phk |
Use subr_unit
|
143756 |
17-Mar-2005 |
phk |
Also remember to set the fsid here.
|
143755 |
17-Mar-2005 |
phk |
Forgot to replace code to set fsid in vop_getattr.
|
143746 |
17-Mar-2005 |
phk |
Prepare for the final onslaught on devices:
Move uid/gid/mode from cdev to cdevsw.
Add kind field to use for devd(8) later.
Bump both D_VERSION and __FreeBSD_version
|
143744 |
17-Mar-2005 |
jeff |
- Lock the clearing of v_data so it is safe to inspect it with the interlock.
Sponsored by: Isilon Systems, Inc.
|
143692 |
16-Mar-2005 |
phk |
Add two arguments to the vfs_hash() KPI so that filesystems which do not have unique hashes (NFS) can also use it.
|
143691 |
16-Mar-2005 |
phk |
Remove unused file
|
143686 |
16-Mar-2005 |
phk |
Remove inode fields previously used for private inode hash tables.
|
143679 |
16-Mar-2005 |
phk |
XXX: unnecessary pointer in inode.
|
143678 |
16-Mar-2005 |
phk |
Don't store the disk cdev in all inodes.
|
143668 |
15-Mar-2005 |
phk |
Don't hold a reference to the disk vnode for each inode.
Eliminate cdev and vnode pointer to the disk from the inodes, the mount holds everything we need.
|
143667 |
15-Mar-2005 |
phk |
Eliminate cdev pointer in inodes, they're not used or needed.
The cdev could have been pulled out of the mountpoint cheaper back when it was used anyway.
|
143666 |
15-Mar-2005 |
phk |
Don't hold a reference on the disk vnode for each inode.
|
143663 |
15-Mar-2005 |
phk |
Improve the vfs_hash() API: vput() the unneeded vnode centrally to avoid replicating the vput in all the filesystems.
|
143642 |
15-Mar-2005 |
jeff |
- Assume that all lower filesystems now support proper locking. Assert that they set v->v_vnlock. This is true for all filesystems in the tree. - Remove all uses of LK_THISLAYER. If the lower layer is locked, the null layer is locked. We only use vget() to get a reference now. null essentially does no locking. This fixes LOOKUP_SHARED with nullfs. - Remove the special LK_DRAIN considerations, I do not believe this is needed now as LK_DRAIN doesn't destroy the lower vnode's lock, and it's hardly used anymore. - Add one well commented hack to prevent the lowervp from going away while we're in its VOP_LOCK routine. This can only happen if we're forcibly unmounted while some callers are waiting in the lock. In this case the lowervp could be recycled after we drop our last ref in null_reclaim(). Prevent this with a vhold().
|
143637 |
15-Mar-2005 |
phk |
Disable two users of findcdev. They do the wrong thing now and will need to be fixed. In both cases the API should be reengineered to do something (more) sensible.
|
143630 |
15-Mar-2005 |
jeff |
- We have to transfer lockers after resetting our vnlock pointer.
Sponsored by: Isilon Systems, Inc.
|
143629 |
15-Mar-2005 |
phk |
Don't export major,minor, instead export tty name.
|
143624 |
15-Mar-2005 |
phk |
Print devtoname() instead of minor().
|
143623 |
15-Mar-2005 |
phk |
Fix typo: pointers are not boolean in style(9).
|
143619 |
15-Mar-2005 |
phk |
Simplify the vfs_hash calling convention.
|
143597 |
14-Mar-2005 |
des |
Hook pfs_lookup() up to vfs_cachedlookup_desc instead of vfs_lookup_desc, as suggested by Matt's comment. Also fix some style and paranoia issues.
The entire function could benefit from review by a VFS guru.
MFC after: 6 weeks
|
143596 |
14-Mar-2005 |
des |
Fix two long-standing bugs in pfs_readdir():
Since we used an sbuf of size resid to accumulate dirents, we would end up returning one byte short when we had enough dirents to fill or exceed the size of the sbuf (the last byte being lost to bogus NUL termination) causing the next call to return EINVAL due to an unaligned offset. This went undetected for a long time because I did most of my testing in single-user mode, where there are rarely enough processes to fill the 4096-byte buffer ls(1) uses. The most common symptom of this bug is that tab completion of /proc or /compat/linux/proc does not work properly when many processes are running.
Also, a check near the top would return EINVAL if resid was smaller than PFS_DELEN, even if it was 0, which is frequently the case and perfectly allowable. Change the test so that it returns 0 if resid is 0.
MFC after: 2 weeks
|
143595 |
14-Mar-2005 |
des |
If PSEUDOFS_TRACE is defined, create a sysctl knob to enable / disable pseudofs call tracing.
|
143592 |
14-Mar-2005 |
des |
fbsdidize.
|
143588 |
14-Mar-2005 |
phk |
Use vfs_hash instead of home-rolled.
|
143577 |
14-Mar-2005 |
phk |
Use vfs_hash instead of home-rolled.
|
143571 |
14-Mar-2005 |
phk |
Use vfs_hash instead of home-rolled.
Correct locking around g_vfs_close()
|
143570 |
14-Mar-2005 |
phk |
Use vfs_hash instead of home-rolling.
|
143514 |
13-Mar-2005 |
jeff |
- VOP_INACTIVE should no longer drop the vnode lock.
Sponsored by: Isilon Systems, Inc.
|
143513 |
13-Mar-2005 |
jeff |
- The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. Check that rather than VI_XLOCK. - VOP_INACTIVE should no longer drop the vnode lock. - The vnode lock is required around calls to vrecycle() and vgone().
Sponsored by: Isilon Systems, Inc.
|
143510 |
13-Mar-2005 |
jeff |
- The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. Check that rather than VI_XLOCK.
Sponsored by: Isilon Systems, Inc.
|
143507 |
13-Mar-2005 |
jeff |
- The c_lock in the coda node does not offer any features over the standard vnode lock. Remove the c_lock and use the vn lock in its place. - Keep the coda lock functions so that the debugging information is preserved, but call directly to the vop_std*lock routines for the real functionality.
Sponsored by: Isilon Systems, Inc.
|
143506 |
13-Mar-2005 |
jeff |
- Deadfs may now use the standard vop lock, get rid of dead_lock(). - We no longer have to take the XLOCK state into consideration in any routines.
Sponsored by: Isilon Systems, Inc.
|
143446 |
12-Mar-2005 |
obrien |
Use unsigned version.
Submitted by: jmallett
|
143444 |
12-Mar-2005 |
obrien |
Fix kernel build on 64-bit machines.
|
143436 |
11-Mar-2005 |
njl |
Correct a last-minute thinko. Instead of copying the nul with the string, nul-terminate the dp->d_name directly and only copy the string.
|
143435 |
11-Mar-2005 |
njl |
The mbnambuf routines combine multiple substrings into a single long filename. Each substring is indexed by the windows ID, a sequential one-based value. The previous code was extremely slow, doing a malloc/strcpy/free for each substring.
This code optimizes these routines with this in mind, using the ID to index into a single array and concatenating each WIN_CHARS chunk at once. (The last chunk is variable-length.)
This code has been tested as working on an FS with difficult filename sizes (255, 13, 26, etc.) It gives a 77.1% decrease in profiled time (total across all functions) and a 73.7% decrease in wall time. Test was "ls -laR > /dev/null".
Per-function time savings: mbnambuf_init: -90.7% mbnambuf_write: -18.7% mbnambuf_flush: -67.1%
MFC after: 1 month
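
The indexing scheme described in r143435 can be sketched as follows. This is a simplified illustration under stated assumptions, not the msdosfs implementation: the names namebuf, namebuf_init(), namebuf_write() and namebuf_flush() are hypothetical, WIN_CHARS is assumed to be 13 and the maximum name length 255, and every chunk except the last is assumed to be exactly WIN_CHARS bytes long.

```c
#include <stddef.h>
#include <string.h>

#define WIN_CHARS  13	/* assumed characters per long-name chunk */
#define WIN_MAXLEN 255	/* assumed maximum long-name length */

struct namebuf {
	char   buf[WIN_MAXLEN + 1];
	size_t len;
};

static void
namebuf_init(struct namebuf *nb)
{
	nb->len = 0;
}

/*
 * Copy one chunk straight into its slot. The one-based id selects the
 * offset, so chunks may arrive in any order and no per-chunk allocation
 * is needed. Returns -1 if the chunk would not fit.
 */
static int
namebuf_write(struct namebuf *nb, const char *chunk, int id)
{
	size_t count = strlen(chunk);
	size_t off;

	if (id < 1)
		return (-1);
	off = (size_t)(id - 1) * WIN_CHARS;
	if (off + count > WIN_MAXLEN)
		return (-1);
	memcpy(nb->buf + off, chunk, count);
	nb->len += count;
	return (0);
}

/* Terminate the assembled name and hand it back. */
static const char *
namebuf_flush(struct namebuf *nb)
{
	nb->buf[nb->len] = '\0';
	return (nb->buf);
}
```

Writing chunk 2 ("xyz") before chunk 1 ("0123456789abc") still yields "0123456789abcxyz", since each chunk lands at the offset implied by its ID.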
|
143383 |
10-Mar-2005 |
phk |
One more bit of the major/minor patch to make ttyname happy as well.
|
143381 |
10-Mar-2005 |
phk |
Try to fix the mess I made of devname, with the minimal subset of the larger minor/major patch which was posted for testing.
|
143303 |
08-Mar-2005 |
phk |
Remove kernelside support for devfs rules filtering on major numbers.
|
142907 |
01-Mar-2005 |
phk |
Avoid a couple of mutex operations in the process exit path for the common case where procfs has never been mounted.
OK'ed by: des
|
142692 |
27-Feb-2005 |
phk |
Remove debug printout of major/minor numbers, print name instead.
|
142255 |
22-Feb-2005 |
sam |
remove dead code
Submitted by: Coverity Prevent analysis tool
|
142250 |
22-Feb-2005 |
phk |
We may not have an actual cdev at this point.
|
142242 |
22-Feb-2005 |
phk |
Reap more benefits from DEVFS:
List devfs_dirents rather than vnodes off their shared struct cdev, this saves a pointer field in the vnode at the expense of a field in the devfs_dirent. There are often 100 times more vnodes so this is a bargain. In addition it makes it harder for people to try to do stupid things like "finding the vnode from cdev".
Since DEVFS handles all VCHR nodes now, we can do the vnode related cleanup in devfs_reclaim() instead of in dev_rel() and vgonel(). Similarly, we can do the struct cdev related cleanup in dev_rel() instead of devfs_reclaim().
rename idestroy_dev() to destroy_devl() for consistency.
Add LIST_ENTRY de_alias to struct devfs_dirent. Remove v_specnext from struct vnode. Change si_hlist to si_alist in struct cdev. String new devfs vnodes' devfs_dirent on si_alist when we create them and take them off in devfs_reclaim().
Fix devfs_revoke() accordingly. Also don't clear fields devfs_reclaim() will clear when called from vgone();
Let devfs_reclaim() call dev_rel() instead of vgonel().
Move the usecount tracking from dev_rel() to devfs_reclaim(), and let dev_rel() take a struct cdev argument instead of vnode.
Destroy SI_CHEAPCLONE devices in dev_rel() (instead of devfs_reclaim()) when they are no longer used. (This should maybe happen in devfs_close() instead.)
|
142238 |
22-Feb-2005 |
phk |
vp->v_id is a private field for the vfs namecache and it is a big mistake that NFS ever started using it and an even bigger one that it got copied&pasted to nwfs and smbfs.
Replace with use of vhold()/vdrop().
|
142235 |
22-Feb-2005 |
phk |
Use vn_printf() instead of home-rolling.
|
142232 |
22-Feb-2005 |
phk |
Make dev_ref() require the dev_lock() to be held and use it from devfs instead of directly frobbing the si_refcount.
|
142152 |
20-Feb-2005 |
das |
Replace the workaround for a deadlock bug in Coda with a different workaround that does not rely on vfs_start().
|
142043 |
18-Feb-2005 |
rwatson |
Remove basically unused root_vp pointer in udfmount.
MFC after: 1 week Discussed with: scottl
|
142040 |
18-Feb-2005 |
rwatson |
Conditionalize cd9660 chattiness regarding the nature of the file system mounted (is it Joliet, RockRidge, High Sierra) based on bootverbose. Most file systems don't generate log messages based on details of the file system superblock, and these log messages disrupt sysinstall output during a new install from CD. We may want to explore exposing this status information using nmount() at some point.
MFC after: 3 days
|
142011 |
17-Feb-2005 |
phk |
Introduce vx_wait{l}() and use it instead of home-rolled versions.
|
141633 |
10-Feb-2005 |
phk |
Make a SYSCTL_NODE static
|
141623 |
10-Feb-2005 |
phk |
make M_NTFSMNT and ntfs_calccfree() static
|
141622 |
10-Feb-2005 |
phk |
Make fdesc_root static
|
141620 |
10-Feb-2005 |
phk |
Make smbfs_debuglevel private.
|
141619 |
10-Feb-2005 |
phk |
don't call vprint with NULL.
|
141618 |
10-Feb-2005 |
phk |
Statize malloc types. Don't call vprint with NULL.
|
141617 |
10-Feb-2005 |
phk |
Statize devfs_ops_f
|
141616 |
10-Feb-2005 |
phk |
Make a bunch of malloc types static.
Found by: src/tools/tools/kernxref
|
141497 |
08-Feb-2005 |
njl |
Unroll the loop for calculating the 8.3 filename checksum. In testing on my P3, microbenchmarks show the unrolled version is 78x faster. In actual use (recursive ls), this gives an average of 9% improvement in system time and 2% improvement in wall time.
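
For reference, the loop form of the 8.3 short-name checksum can be sketched as below. This follows the commonly documented VFAT algorithm (rotate the running sum right by one bit, then add the next byte, over all 11 name bytes); the function name sfn_checksum() is hypothetical and this is not the msdosfs code. The commit above replaces a loop of this kind with 11 explicit steps.

```c
#include <stdint.h>

/*
 * Checksum over the 11-byte space-padded 8.3 name: for each byte,
 * rotate the 8-bit sum right by one and add the byte, modulo 256.
 */
static uint8_t
sfn_checksum(const uint8_t name[11])
{
	uint8_t sum = 0;

	for (int i = 0; i < 11; i++)
		sum = (uint8_t)((((sum & 1) << 7) | (sum >> 1)) + name[i]);
	return (sum);
}
```

Because the trip count is fixed at 11, the loop can be written out by hand without changing the result.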
|
141447 |
07-Feb-2005 |
phk |
Remove vop_destroyvobject()
|
141442 |
07-Feb-2005 |
phk |
Deimplement vop_destroyvobject()
|
141439 |
07-Feb-2005 |
phk |
Remove vop_destroyvobject() initialization.
|
140965 |
29-Jan-2005 |
peadar |
Unbreak a few filesystems for which vnode_create_vobject() wasn't being called in "open", causing mmap() to fail.
Where possible, pass size of file to vnode_create_vobject() rather than having it find it out the hard way via VOP_LOOKUP
Reviewed by: phk
|
140939 |
28-Jan-2005 |
phk |
Make filesystems get rid of their own vnodes' vnode_pager objects in VOP_RECLAIM().
|
140936 |
28-Jan-2005 |
phk |
Remove unused argument to vrecycle()
|
140904 |
27-Jan-2005 |
peadar |
Make NTFS at least minimally usable after bufobj and GEOM fallout.
mmap() on NTFS files was hosed, returning pages offset from the start of the disk rather than the start of the file. (ie, "cp" of a 1-block file would get you a copy of the boot sector, not the data in the file.) The solution isn't ideal, but gives a functioning filesystem.
Cached vnode lookup was also broken, resulting in vnode haemorrhage. A lookup on the same file twice would give you two vnodes, and the resulting cached pages.
Just recently, mmap() was broken due to a lack of a call to vnode_create_vobject() in ntfs_open().
Discussed with: phk@
|
140822 |
25-Jan-2005 |
phk |
Introduce and use g_vfs_close().
|
140783 |
25-Jan-2005 |
phk |
Take VOP_GETVOBJECT() out to pasture. We use the direct pointer now.
|
140781 |
25-Jan-2005 |
phk |
Kill VOP_CREATEVOBJECT(), it is now the responsibility of the filesystem for a given vnode to create a vnode_pager object if one is needed.
|
140780 |
24-Jan-2005 |
phk |
Don't implement vop_createvobject(), vop_open() and vop_close() manage this for nullfs now.
|
140779 |
24-Jan-2005 |
phk |
Don't call VOP_CREATEVOBJECT(), it's the responsibility of the filesystem which owns the vnode.
|
140776 |
24-Jan-2005 |
phk |
Add null_open() and null_close() which call null_bypass() and manage the v_object pointer.
|
140768 |
24-Jan-2005 |
phk |
Create a vp->v_object in VFS_FHTOVP() if we want to be exportable with NFS.
We are moving responsibility for creating the vnode_pager object into the filesystems which own the vnode, and this is one of the places we have to cover.
We call vnode_create_vobject() directly because we own the vnode.
If we can get the size easily, pass it as an argument to save the call to VOP_GETATTR() in vnode_create_vobject()
|
140734 |
24-Jan-2005 |
phk |
Kill the VV_OBJBUF and test the v_object for NULL instead.
|
140732 |
24-Jan-2005 |
phk |
Remove "register" keywords.
|
140728 |
24-Jan-2005 |
phk |
Style: Remove the commented out vop_foo_args replicas.
|
140471 |
19-Jan-2005 |
phk |
whitespace nit
|
140470 |
19-Jan-2005 |
phk |
Remove unused coda_fbsd_getpages()
|
140416 |
18-Jan-2005 |
scottl |
Fix an incorrect cast.
Submitted by: Andriy Gapon MFC-after: 3 days.
|
140250 |
14-Jan-2005 |
scottl |
NULL-terminate the . and .. directory entries. Apparently some tools ignore d_namlen and assume that d_name is null-terminated.
Submitted by: Andriy Gapon
|
140249 |
14-Jan-2005 |
scottl |
Replace the min() macro with a test that doesn't truncate the 64-bit values that are used. Thanks to Bruce Evans for pointing this out.
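
The truncation problem mentioned in r140249 can be reproduced with a small sketch. It assumes a min() that takes 32-bit unsigned arguments, which is what the word "truncate" in the message suggests; min_u32() and min_off() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Stand-in for a min() whose parameters are only 32 bits wide. */
static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/* Width-preserving comparison for 64-bit offsets and sizes. */
static int64_t
min_off(int64_t a, int64_t b)
{
	return (a < b ? a : b);
}
```

Passing 0x100000001 to the 32-bit version silently drops the high word, so the value compares as 1 and "wins" against 5; the 64-bit comparison returns 5 as expected.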
|
140223 |
14-Jan-2005 |
phk |
Eliminate unused and constant arguments to smbfs_vinvalbuf()
|
140222 |
14-Jan-2005 |
phk |
Eliminate constant and unused arguments to nwfs_vinvalbuf()
|
140220 |
14-Jan-2005 |
phk |
Eliminate unused and unnecessary "cred" argument from vinvalbuf()
|
140196 |
13-Jan-2005 |
phk |
Whitespace in vop_vector{} initializations.
|
140181 |
13-Jan-2005 |
phk |
Ditch vfs_object_create() and make the callers call VOP_CREATEVOBJECT() directly.
|
140165 |
13-Jan-2005 |
phk |
Change the generated VOP_ macro implementations to improve type checking and KASSERT coverage.
After this change there is only one "nasty" cast in this code but there is a KASSERT to protect against the wrong argument structure behind that cast.
Un-inlining the meat of VOP_FOO() saves 35kB of text segment on a typical kernel with no change in performance.
We also now run the checking and tracing on VOP's which have been layered by nullfs, umapfs, deadfs or unionfs.
Add new (non-inline) VOP_FOO_AP() functions which take a "struct foo_args" argument and do everything the VOP_FOO() macros used to do with checks and debugging code.
Add a KASSERT to VOP_FOO_AP() to check that the argument type is correct.
Slim down VOP_FOO() inline functions to just stuff arguments into the struct foo_args and call VOP_FOO_AP().
Put function pointer to VOP_FOO_AP() into vop_foo_desc structure and make VCALL() use it instead of the current offsetof() hack.
Retire vcall() which implemented the offsetof() hack.
Make deadfs and unionfs use VOP_FOO_AP() calls instead of VCALL(), we know which specific call we want already.
Remove unneeded arguments to VCALL() in nullfs and umapfs bypass functions.
Remove unused vdesc_offset and VOFFSET().
Generally improve style/readability of the generated code.
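The split between a thin VOP_FOO() wrapper and an out-of-line VOP_FOO_AP() worker can be sketched in userland C as follows. The structure layouts, the examplefs_* names and the use of assert() in place of KASSERT are simplifying assumptions, not the generated kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct vop_foo_args;

/* Descriptor identifying one operation (simplified). */
struct vnodeop_desc {
    const char *vdesc_name;
};

/* Typed method table; the real vop_vector has one slot per VOP. */
struct vop_vector {
    int (*vop_foo)(struct vop_foo_args *);
};

struct vnode {
    struct vop_vector *v_op;
};

struct vop_foo_args {
    struct vnodeop_desc *a_desc;    /* identifies the argument type */
    struct vnode *a_vp;
    int a_value;
};

static struct vnodeop_desc vop_foo_desc = { "vop_foo" };

/* Out-of-line worker: checks the argument type, then dispatches. */
static int
VOP_FOO_AP(struct vop_foo_args *ap)
{
    assert(ap->a_desc == &vop_foo_desc);    /* stand-in for KASSERT */
    return (ap->a_vp->v_op->vop_foo(ap));
}

/* Thin inline wrapper: only packs the arguments. */
static inline int
VOP_FOO(struct vnode *vp, int value)
{
    struct vop_foo_args a;

    a.a_desc = &vop_foo_desc;
    a.a_vp = vp;
    a.a_value = value;
    return (VOP_FOO_AP(&a));
}

/* Hypothetical filesystem method and a vnode that uses it. */
static int
examplefs_foo(struct vop_foo_args *ap)
{
    return (ap->a_value * 2);
}

static struct vop_vector examplefs_vnodeops = { .vop_foo = examplefs_foo };
static struct vnode example_vnode = { .v_op = &examplefs_vnodeops };
```

A layered filesystem that already holds a struct vop_foo_args can call VOP_FOO_AP() directly, which is why the checks live in the worker rather than in the wrapper.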
|
140105 |
12-Jan-2005 |
scottl |
Use off_t when passing and calculating file offsets. While a single extent in UDF is only 32 bits, multiple extents can exist in a file. Also clean up some minor whitespace problems.
Submitted by: John Wehle
|
140104 |
12-Jan-2005 |
scottl |
Don't allow reads past the end of a file.
Submitted by: John Wehle, Andriy Gapon MFC After: 3 days
|
140067 |
11-Jan-2005 |
phk |
Silently ignore forced argument to unmount.
|
140051 |
11-Jan-2005 |
phk |
Wrap the bufobj operations in macros: BO_STRATEGY() and BO_WRITE()
|
140048 |
11-Jan-2005 |
phk |
Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().
I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense:
The credentials for syncing a file (ability to write to the file) should be checked at the system call level.
Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well.
If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to be the cached mount credential, or a credential cached along with any delayed write data.
Discussed with: rwatson
|
139984 |
10-Jan-2005 |
phk |
whitespace
|
139896 |
08-Jan-2005 |
rwatson |
Annotate that pfs_exit() always acquires and releases two mutexes for every process exit, even if procfs isn't mounted. And one of those mutexes is Giant. No immediate thoughts on fixing this.
|
139790 |
06-Jan-2005 |
imp |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
139776 |
06-Jan-2005 |
imp |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
139745 |
05-Jan-2005 |
imp |
Start each of the license/copyright comments with /*-
|
139664 |
04-Jan-2005 |
phk |
Unsupport forceful unmounts of DEVFS.
After discussing things I have decided to take the easy and consistent 90% solution instead of aiming for the very involved 99% solution.
If we allow forceful unmounts of DEVFS we need to decide how to handle the devices which are in use through this filesystem at the time.
We cannot just readopt the open devices in the main /dev instance since that would open us to security issues.
For the majority of the devices, this is relatively straightforward as we can just pretend they got revoke(2)'ed.
Some devices get tricky: /dev/console and /dev/tty for instance do a sort of recursive open of the real console device. Other devices may be mmap'ed (kill the processes?).
And then there are disk devices which are mounted.
The correct thing here would be to recursively unmount the filesystems mounted from devices from our DEVFS instance (forcefully) and if this succeeds, complete the forceful unmount of DEVFS. But if one of the forceful unmounts fails we cannot complete the forceful unmount of DEVFS, but we are likely to already have severed a lot of stuff in the process of trying.
Even attempting this would be a lot of code for a very far out corner-case which most people would never see or get in touch with.
It's just not worth it.
|
139189 |
22-Dec-2004 |
phk |
Be consistent about flag values passed to device drivers read/write methods:
Read can see O_NONBLOCK and O_DIRECT.
Write can see O_NONBLOCK, O_DIRECT and O_FSYNC.
In addition O_DIRECT is shadowed as IO_DIRECT for now for backwards compatibility.
|
139188 |
22-Dec-2004 |
phk |
Shuffle numeric values of the IO_* flags to match the O_* flags from fcntl.h.
This is in preparation for making the flags passed to device drivers be consistently from fcntl.h for all entrypoints.
Today open, close and ioctl use fcntl.h flags, while read and write use vnode.h flags.
|
139085 |
20-Dec-2004 |
phk |
We can only ever get to vgonechrl() from a devfs vnode, so we do not need to reassign the vp->v_op to devfs_specops, we know that is the value already.
Make devfs_specops private to devfs.
|
139083 |
20-Dec-2004 |
phk |
Add a couple of KASSERTS to try to diagnose a problem reported.
|
138841 |
14-Dec-2004 |
phk |
Be a bit more assertive about vnode bypass.
|
138810 |
13-Dec-2004 |
ssouhlal |
Exporting of NTFS filesystem broke in rev 1.70. Fix it.
Approved by: phk, grehan (mentor)
|
138796 |
13-Dec-2004 |
phk |
Don't forget to bypass vnodes in corner cases.
Found by: kkenn and ports/shell/zsh Thanks to: jeffr
|
138791 |
13-Dec-2004 |
phk |
Another FNONBLOCK -> O_NONBLOCK.
Don't unconditionally set IO_UNIT to device drivers in write: nobody checks it, and since it was always set it did not carry information anyway.
|
138790 |
13-Dec-2004 |
phk |
Use O_NONBLOCK instead of FNONBLOCK alias.
|
138788 |
13-Dec-2004 |
phk |
Explicit panic in vop_read/vop_write for devices
|
138784 |
13-Dec-2004 |
phk |
Explicitly panic vop_read/vop_write on fifos.
|
138737 |
12-Dec-2004 |
phk |
Don't deref NULL if no charset-conversion is specified.
Return correct vnode in vop_bmap()
|
138689 |
11-Dec-2004 |
phk |
Handle MNT_UPDATE export requests first and return so we do not interpret the rest of the msdosfs_args structure.
Detected by: marcel
|
138678 |
11-Dec-2004 |
phk |
typo
|
138519 |
07-Dec-2004 |
phk |
First save from editor, *then* commit.
|
138518 |
07-Dec-2004 |
phk |
Fix exports.
|
138509 |
07-Dec-2004 |
phk |
The remaining part of nmount/omount/rootfs mount changes. I cannot sensibly split the conversion of the remaining three filesystems out from the root mounting changes, so in one go:
cd9660: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom(). Rely on vfs_mount.c calling VFS_STATFS().
nfs(client): Convert to nmount (the simple way, mount_nfs(8) is still necessary). Add omount compat shims. Drop COMPAT_PRELITE2 mount arg compatibility.
ffs: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom(). Rely on vfs_mount.c calling VFS_STATFS().
Remove vfs_omount() method, all filesystems are now converted.
Remove MNTK_WANTRDWR, handling RO/RW conversions is a filesystem task, and they all do it now.
Change rootmounting to use DEVFS trampoline:
vfs_mount.c: Mount devfs on /. Devfs needs no 'from' so this is clean. symlink /dev to /. This makes it possible to lookup /dev/foo. Mount "real" root filesystem on /. Surgically move the devfs mountpoint from under the real root filesystem onto /dev in the real root filesystem.
Remove now unnecessary getdiskbyname().
kern_init.c: Don't do devfs mounting and rootvnode assignment here, it was already handled by vfs_mount.c.
Remove now unused bdevvp(), addaliasu() and addalias(). Put the few necessary lines in devfs where they belong. This eliminates the second-last source of bogo vnodes, leaving only the lemming-syncer.
Remove rootdev variable, it doesn't give meaning in a global context and was not trustworthy anyway. Correct information is provided by statfs(/).
|
138495 |
06-Dec-2004 |
phk |
Use vfs_mountedfrom().
Since VFS_STATFS() always calls the filesystem with mp->mnt_stat now, the vfs_statfs method is now a no-op. Explain this in a comment.
|
138491 |
06-Dec-2004 |
phk |
Trust vfs_mount to call VFS_STATFS() on all mounts.
|
138490 |
06-Dec-2004 |
phk |
Convert to nmount. Add omount compat.
Unpropagate the sm_args function into the runtime part.
|
138489 |
06-Dec-2004 |
phk |
Convert to nmount. Add omount compat.
Use vfs_mountedon(). Rely on vfs_mount.c calling VFS_STATFS().
|
138488 |
06-Dec-2004 |
phk |
Convert to nmount. Add omount compat.
Same comment about charset conversions apply.
Use vfs_mountedfrom(). Rely on vfs_mount.c calling VFS_STATFS().
|
138487 |
06-Dec-2004 |
phk |
Convert to nmount. Add backwards compat cmount method.
Same comment as msdosfs applies: It would be nice if we had generic option names for charset conversions.
Use vfs_mountedfrom(). Rely on vfs_mount.c calling VFS_STATFS().
|
138486 |
06-Dec-2004 |
phk |
Convert nwfs to nmount, but take the low road: There is no way this is ever going to work without a dedicated mount_nwfs(8) program so simply stick struct nwfs_args into a nmount argument and leave it at that.
|
138485 |
06-Dec-2004 |
kan |
Fix a typo in PFS_TRACE.
PR: kern/74461 Submitted by: Craig Rodrigues <rodrigc at crodrigues.org>
|
138484 |
06-Dec-2004 |
phk |
ufs vfs_mountedon(), rely on vfs_mount.c calling VFS_STATFS()
|
138483 |
06-Dec-2004 |
phk |
Use vfs_mountedfrom(), rely on vfs_mount.c calling VFS_STATFS().
|
138481 |
06-Dec-2004 |
phk |
Use vfs_mountedfrom() and rely on vfs_mount.c to call VFS_STATFS()
|
138478 |
06-Dec-2004 |
phk |
Convert coda to nmount.
|
138471 |
06-Dec-2004 |
phk |
Convert msdosfs to nmount.
Add a vfs_cmount() function which converts the omount argument structure to nmount arguments.
Convert vfs_omount() to vfs_mount() and parse nmount arguments.
This is 100% compatible with existing userland.
Later on, but before userland gets converted to nmount we may want to revisit the names of the mountoptions, for instance it may make sense to use consistent options for charset conversion etc.
|
138443 |
06-Dec-2004 |
phk |
Fix warning
|
138412 |
05-Dec-2004 |
phk |
VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases don't. Most of the implementations have grown weeds for this so they copy some fields from mnt_stat if the passed argument isn't that.
Fix this the cleaner way: Always call the implementation on mnt_stat and copy that in toto to the VFS_STATFS argument if different.
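The "always call on mnt_stat, then copy" approach can be sketched in userland C. The structures are reduced to a few fields, and vfs_statfs_wrapper() and examplefs_statfs() are hypothetical names used only for illustration.

```c
/* Reduced stand-ins for the kernel structures (assumed layouts). */
struct statfs {
    long f_blocks;
    long f_bfree;
    char f_mntfromname[32];
};

struct mount {
    struct statfs mnt_stat;
    int (*vfs_statfs)(struct mount *, struct statfs *);
};

/* Filesystem method: it is only ever handed &mp->mnt_stat. */
static int
examplefs_statfs(struct mount *mp, struct statfs *sbp)
{
    (void)mp;
    sbp->f_blocks = 1000;
    sbp->f_bfree = 250;
    return (0);
}

/*
 * Wrapper: run the method on mnt_stat, then copy the whole structure
 * to the caller's buffer if the caller passed a different one.
 */
static int
vfs_statfs_wrapper(struct mount *mp, struct statfs *sbp)
{
    int error;

    error = mp->vfs_statfs(mp, &mp->mnt_stat);
    if (error == 0 && sbp != &mp->mnt_stat)
        *sbp = mp->mnt_stat;
    return (error);
}

static struct mount example_mount = {
    .mnt_stat = { .f_mntfromname = "/dev/da0" },
    .vfs_statfs = examplefs_statfs,
};
```

Because the copy is wholesale, fields the method never touches, such as the mounted-from name, reach the caller without per-filesystem copying code.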
|
138367 |
04-Dec-2004 |
phk |
Remove embryonic rootfs mounting facility.
In the near future rootfs mounting will not require special handling in the filesystems.
|
138309 |
02-Dec-2004 |
phk |
Remove the de_devvp and stop VREF'ing it for every vnode we create.
|
138290 |
01-Dec-2004 |
phk |
Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals.
Adjust to more contemporary circumstances and gain type checking.
Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to believe this would ever be feasible in the first place.
Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.
Give coda correct prototypes and function definitions for all vop_()s.
Generate a bit more data from the vnode_if.src file: a struct vop_vector and prototype typedefs for all vop methods.
Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector.
Remove a lot of vfs_init since vop_vector is ready to use from the compiler.
Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc.
Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse.
Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)
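A typed method vector with a vop_default pointer to another vector can be sketched in userland C. The two-method table, the examplefs_* and std_* names, and the lookup loop in call_read()/call_write() are assumptions for illustration; how the kernel resolves the chain is not described in this log.

```c
#include <stddef.h>

/* Typed method table; unset slots are NULL and fall through to vop_default. */
struct vop_vector {
    struct vop_vector *vop_default;
    int (*vop_read)(int);
    int (*vop_write)(int);
};

static int
std_read(int n)
{
    return (n);
}

static int
std_write(int n)
{
    (void)n;
    return (-1);
}

static int
examplefs_write(int n)
{
    return (n + 1);
}

/* Designated initializers: the compiler checks every function's type. */
static struct vop_vector default_vnodeops = {
    .vop_read = std_read,
    .vop_write = std_write,
};

static struct vop_vector examplefs_vnodeops = {
    .vop_default = &default_vnodeops,
    .vop_write = examplefs_write,
};

/* Walk the vop_default chain until some vector implements the method. */
static int
call_read(struct vop_vector *vop, int n)
{
    while (vop != NULL && vop->vop_read == NULL)
        vop = vop->vop_default;
    return (vop != NULL ? vop->vop_read(n) : -1);
}

static int
call_write(struct vop_vector *vop, int n)
{
    while (vop != NULL && vop->vop_write == NULL)
        vop = vop->vop_default;
    return (vop != NULL ? vop->vop_write(n) : -1);
}
```

Assigning a function with the wrong prototype to a slot is a compile-time diagnostic here, which is the type checking the untyped vop_t tables could not provide.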
|
138281 |
01-Dec-2004 |
cperciva |
Fix unvalidated pointer dereference. This is FreeBSD-SA-04:17.procfs.
|
138279 |
01-Dec-2004 |
phk |
hpfs_lookup() should have a vop_cachedlookup_t prototype and corresponding argument.
|
138277 |
01-Dec-2004 |
phk |
Correctly prototype union_write with vop_write_t, not vop_read_t.
|
138270 |
01-Dec-2004 |
phk |
Mechanically change prototypes for vnode operations to use the new typedefs.
|
138106 |
26-Nov-2004 |
phk |
Ignore MNT_NODEV, it is implicit in choice of filesystem these days.
|
138105 |
26-Nov-2004 |
phk |
Eliminate null_open() and use null_bypass() instead.
Null_open() was only here to handle MNT_NODEV, but since that does not affect any filesystems anymore, it could only have any effect if you nullfs mounted a devfs but didn't want devices to show up.
If you need that, there are easier ways.
|
138075 |
25-Nov-2004 |
phk |
Use system wide no-op vfs_start function.
|
137867 |
18-Nov-2004 |
phk |
Add dropped implementation of ioctl for fifos.
|
137801 |
17-Nov-2004 |
phk |
Make vnode bypass for fifos (read, write, poll) mandatory.
|
137800 |
17-Nov-2004 |
phk |
Make vnode bypass for devices mandatory.
|
137755 |
15-Nov-2004 |
phk |
Make vnode bypass the default for devices.
Can be disabled in case of problems with vfs.devfs.fops=0 in loader.conf
|
137739 |
15-Nov-2004 |
phk |
Add file ops to fifofs so that we can bypass vnodes (and Giant) for the heavy-duty operations (read, write, poll/select, kqueue).
Disabled for now, enable with "vfs.fifofs.fops=1" in loader.conf.
|
137726 |
15-Nov-2004 |
phk |
Make VOP_BMAP return a struct bufobj for the underlying storage device instead of a vnode for it.
The vnode_pager does not and should not have any interest in what the filesystem uses for backend.
(vfs_cluster doesn't use the backing store argument.)
|
137679 |
13-Nov-2004 |
phk |
Integrate most of vop_revoke() into devfs_revoke() where it belongs.
|
137678 |
13-Nov-2004 |
phk |
Add the devfs_fp_check() function which helps us get from a struct file to a cdev and a devsw, doing all the relevant checks along the way.
Add the check to see if fp->f_vnode->v_rdev differs from our cached fp->f_data copy of our cdev. If it does the device was revoked and we return ENXIO.
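The revocation check can be sketched in userland C. The structures are reduced to the fields named above, f_data is typed as a cdev pointer for simplicity, and fp_check() is a hypothetical stand-in that omits the devsw lookup the real devfs_fp_check() performs.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures (assumed layouts). */
struct cdev {
    int si_unit;
};

struct vnode {
    struct cdev *v_rdev;
};

struct file {
    struct vnode *f_vnode;
    struct cdev *f_data;    /* cdev cached when the file was opened */
};

/*
 * Return the cdev behind fp, or ENXIO if the vnode no longer points at
 * the cdev that was cached at open time (the device was revoked).
 */
static int
fp_check(struct file *fp, struct cdev **devp)
{
    if (fp->f_vnode == NULL || fp->f_vnode->v_rdev != fp->f_data)
        return (ENXIO);
    *devp = fp->f_data;
    return (0);
}
```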
|
137676 |
13-Nov-2004 |
phk |
VOP_REVOKE() is only ever for VCHR vnodes, so unionfs does not need a vop_revoke() method.
|
137673 |
13-Nov-2004 |
phk |
fifos don't need a vop_lookup, the default will do fine.
|
137647 |
13-Nov-2004 |
phk |
Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST.
Use this in all the places where sleeping with the lock held is not an issue.
The distinction will become significant once we finalize the exact lock-type to use for this kind of case.
|
137488 |
09-Nov-2004 |
trhodes |
Remove stale comment after previous commit.
Noticed by: pjd
|
137480 |
09-Nov-2004 |
phk |
Detect root mount attempts on the flag, not on the NULL path.
|
137479 |
09-Nov-2004 |
phk |
Refuse attempts to mount root filesystem
|
137478 |
09-Nov-2004 |
phk |
Refuse attempts to mount root filesystem
|
137382 |
08-Nov-2004 |
phk |
Add optional device vnode bypass to DEVFS.
The tunable vfs.devfs.fops controls this feature and defaults to off.
When enabled (vfs.devfs.fops=1 in loader), device vnodes opened through a filedescriptor get a special fops vector which, instead of taking the detour through the vnode layer, goes directly to DEVFS.
Amongst other things this allows us to run Giant free read/write to device drivers which have been weaned off D_NEEDGIANT.
Currently this means /dev/null, /dev/zero, disks, (and maybe the random stuff ?)
On a 700MHz K7 machine this doubles the speed of dd if=/dev/zero of=/dev/null bs=1 count=1000000
This roughly translates to shaving 2usec off each read/write syscall.
The poll/kqfilter paths need more work before they are giant free, this work is ongoing in p4::phk_bufwork
Please test this and report any problems, LORs etc.
|
137308 |
06-Nov-2004 |
phk |
Properly implement a default version of VOP_GETWRITEMOUNT.
Remove improper access to vop_stdgetwritemount() which should and will instead rely on the VOP default path.
|
137195 |
04-Nov-2004 |
phk |
Add back securelevel check for disks.
XXX: This should live in geom_dev.c but we don't have access to the cred there. XXX: XXX: This may not matter anymore since filesystems use geom_vfs.
|
137185 |
04-Nov-2004 |
phk |
s/ffs/ntfs/
Fix error handling to not use VOP_CLOSE() on the disk.
Spotted by: tegge
|
137172 |
03-Nov-2004 |
phk |
Make a more whole-hearted attempt at GEOM'ifying NTFS.
I must have been sleepy when I did the first pass.
Spotted by: tegge
|
137047 |
29-Oct-2004 |
phk |
Don't give disks special treatment, they don't come this way anymore.
|
137043 |
29-Oct-2004 |
phk |
Remove VOP_SPECSTRATEGY() from the system.
|
137041 |
29-Oct-2004 |
phk |
Move NTFS to GEOM backing instead of DEVFS.
For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.
|
137040 |
29-Oct-2004 |
phk |
Move HPFS to GEOM backing instead of DEVFS.
For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.
|
137038 |
29-Oct-2004 |
phk |
Move CD9660 to GEOM backing instead of DEVFS.
For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.
|
137037 |
29-Oct-2004 |
phk |
Move UDF to GEOM backing instead of DEVFS.
For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.
|
137036 |
29-Oct-2004 |
phk |
Move MSDOSFS to GEOM backing instead of DEVFS.
For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.
|
137029 |
29-Oct-2004 |
phk |
Give dev_strategy() an explicit cdev argument in preparation for removing buf->b_dev.
Put a bio between the buf passed to dev_strategy() and the device driver strategy routine in order to not clobber fields in the buf.
Assert copyright on vfs_bio.c and update copyright message to canonical text. There is no legal difference between John Dyson's two-clause abbreviated BSD license and the canonical text.
|
137008 |
28-Oct-2004 |
phk |
Reduce the locking activity by epsilon by checking VNON condition before releasing the mountlock.
|
137006 |
28-Oct-2004 |
phk |
What can I say: don't allow people to mount DEVFS with option "nodev".
|
136991 |
27-Oct-2004 |
phk |
Eliminate unnecessary KASSERTs.
Don't use bp->b_vp in VOP_STRATEGY: the vnode is passed in as an argument.
|
136966 |
26-Oct-2004 |
phk |
Put the I/O block size in bufobj->bo_bsize.
We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably moot when filesystems sit on GEOM, so don't bother for now.
|
136943 |
25-Oct-2004 |
phk |
Lose the v_dirty* and v_clean* alias macros.
Check the count field where we just want to know the full/empty state, rather than using TAILQ_EMPTY() or TAILQ_FIRST().
|
136770 |
22-Oct-2004 |
phk |
Alas, poor SPECFS! -- I knew him, Horatio; A filesystem of infinite jest, of most excellent fancy: he hath taught me lessons a thousand times; and now, how abhorred in my imagination it is! my gorge rises at it. Here were those hacks that I have curs'd I know not how oft. Where be your kludges now? your workarounds? your layering violations, that were wont to set the table on a roar?
Move the skeleton of specfs into devfs where it now belongs and bury the rest.
|
136152 |
05-Oct-2004 |
jhb |
Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand.
- Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console.
Submitted by: bde (mostly) MFC after: 1 month
|
136146 |
05-Oct-2004 |
takawata |
Minor Bug fix. Some file was not translated.
|
136135 |
05-Oct-2004 |
takawata |
Fix unionfs problems when a directory is mounted on another directory on a different file system. This may cause problems with my previous fix. Now it translates the fsid only for direct children of the mount point directory.
Pointed out by: Uwe Doering
|
136060 |
02-Oct-2004 |
takawata |
Fix a problem when you try to mount a directory on another directory that belongs to the same filesystem. In this case, getcwd(3) will fail.
I found the problem two years ago and I have forgotten to merge.
http://docs.FreeBSD.org/cgi/mid.cgi?200202251435.XAA91094
|
136004 |
01-Oct-2004 |
das |
Don't PHOLD() the target process in procfs, since this is already done in pseudofs. Moreover, PHOLD() may block between the p_candebug() access check and the actual operation.
|
135727 |
24-Sep-2004 |
phk |
XXX mark two places where we do not hold a threadcount on the dev when frobbing the cdevsw.
In both cases we examine only the cdevsw and it is a good question if we weren't better off copying those properties into the cdev in the first place. This question will be revisited.
|
135722 |
24-Sep-2004 |
phk |
Hold proper thread count while frobbing drivers ioctl.
|
135719 |
24-Sep-2004 |
phk |
Remove devsw() call missed in last commit.
|
135706 |
24-Sep-2004 |
phk |
Use def_re[fl]thread().
Retire various old compatibility helpers.
|
135617 |
23-Sep-2004 |
phk |
Eliminate DEV_STRATEGY() macro: call dev_strategy() directly.
Make dev_strategy() handle errors and departing devices properly.
|
135613 |
23-Sep-2004 |
phk |
Do not use devsw() but si_devsw directly. This is still bogus but a fair bit less so.
|
135600 |
23-Sep-2004 |
phk |
Do not refcount the cdevsw, but rather maintain a cdev->si_threadcount of the number of threads which are inside whatever is behind the cdevsw for this particular cdev.
Make the device mutex visible through dev_lock() and dev_unlock(). We may want finer granularity later.
Replace spechash_mtx use with dev_lock()/dev_unlock().
|
135578 |
22-Sep-2004 |
phk |
Pointy hat please!
Refuse VCHR not VREG.
|
135541 |
21-Sep-2004 |
phk |
Desupport opening device nodes on CD9660 filesystems. They are still visible, they can still be seen, but they cannot be opened. Use DEVFS for that.
|
135459 |
19-Sep-2004 |
phk |
The getpages VOP was a good stab at getting scatter/gather I/O without too much kernel copying, but it is not the right way to do it, and it is in the way for straightening out the buffer cache.
The right way is to pass the VM page array down through the struct bio to the disk device driver and DMA directly into/out of the physical memory. Once the VM/buf thing is sorted out it is next on the list.
Retire most of the ffs_getpages() vnode method. It is not clear if what is left shouldn't be in the default implementation which we now fall back to.
Retire specfs_getpages() as well, as it has no users now.
|
135280 |
15-Sep-2004 |
phk |
Remove unused B_WRITEINPROG flag
|
135135 |
13-Sep-2004 |
phk |
Remove the buffercache/vnode side of BIO_DELETE processing in preparation for integration of p4::phk_bufwork. In the future, local filesystems will talk to GEOM directly and they will consequently be able to issue BIO_DELETE directly. Since the removal of the fla driver, BIO_DELETE has effectively been a no-op anyway.
|
134945 |
08-Sep-2004 |
tjr |
Reduce the size of struct defid's defid_dirclust, defid_dirofs and (disabled) defid_gen members from u_long to u_int32_t so that alignment requirements don't cause the structure to become larger than struct fid on LP64 platforms. This fixes NFS exports of msdos filesystems on at least amd64.
PR: 71173
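The size problem can be sketched with stand-in structures. Field names are shortened, uint64_t models u_long on an LP64 platform, and the 16-byte payload of fid_like is an assumption about struct fid rather than something stated in this log.

```c
#include <stdint.h>

/* Assumed shape of the generic file handle: two shorts plus 16 data bytes. */
struct fid_like {
    unsigned short fid_len;
    unsigned short fid_pad;
    char fid_data[16];
};

/* Old layout on LP64: the 64-bit members force padding after the shorts. */
struct defid_wide {
    unsigned short defid_len;
    unsigned short defid_pad;
    uint64_t defid_dirclust;
    uint64_t defid_dirofs;
};

/* New layout: 32-bit members pack tightly behind the two shorts. */
struct defid_narrow {
    unsigned short defid_len;
    unsigned short defid_pad;
    uint32_t defid_dirclust;
    uint32_t defid_dirofs;
};

/* Compile-time guard: the filesystem's handle must fit in the generic one. */
_Static_assert(sizeof(struct defid_narrow) <= sizeof(struct fid_like),
    "defid must fit inside fid");
```

With 8-byte alignment for the 64-bit members, the wide layout occupies 24 bytes against 20 for fid_like, while the narrow layout needs 12.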
|
134942 |
08-Sep-2004 |
tjr |
Merge from NetBSD: Fix a problem in previous: we can't blindly assume that we have wincnt entries available at the offset the file has been found. If the dos directory entry is not preceded by appropriate number of long name entries (happens e.g. when the filesystem is corrupted, or when the filename complies to DOS rules and doesn't use any long name entry), we would overwrite random directory entries.
There are still some problems, the whole thing has to be revisited and solved right.
Submitted by: Xin LI
|
134941 |
08-Sep-2004 |
tjr |
Merge from NetBSD: Fix a panic that occurred when trying to traverse a corrupt msdosfs filesystem. With this particular corruption, the code in pcbmap() would compute an offset into an array that was way out of bounds, so check the bounds before trying to access and return an error if the offset would be out of bounds.
Submitted by: Xin LI
|
134899 |
07-Sep-2004 |
phk |
Create simple function init_va_filerev() for initializing a va_filerev field.
Replace three instances of longhaired initialization of va_filerev fields.
Added XXX comment wondering why we don't use random bits instead of uptime of the system for this purpose.
|
134897 |
07-Sep-2004 |
phk |
Explicitly pass vnode to smbfs_doio() function.
|
134896 |
07-Sep-2004 |
phk |
Explicitly pass the vnode to the nw_doio() function.
|
134807 |
05-Sep-2004 |
tjr |
Temporarily back out revision 1.77. This changed cd9660_getattr() and cd9660_readdir() to return the address of the file's first data block as the inode number instead of the address of the directory entry, but neglected to update cd9660_vget_internal() for the new inode numbering scheme.
Since the NFS server calls VFS_VGET (cd9660_vget()) with inode numbers returned through VOP_READDIR (cd9660_readdir()) when servicing a READDIRPLUS request, these two interfaces must agree on the numbering scheme; failure to do so caused panics and/or bogus information about the entries to be returned to clients using READDIRPLUS (Solaris, FreeBSD w/ mount -o rdirplus).
PR: 63446
|
134647 |
02-Sep-2004 |
rwatson |
Back out pseudo_vnops.c:1.45, which was a workaround for pfind() returning incompletely initialized processes. This problem was eliminated by kern_proc.c:1.215, which causes pfind() not to return processes in the PRS_NEW state.
|
134585 |
01-Sep-2004 |
brooks |
General modernization of coda:
- Ditch NVCODA
- Don't use a static major
- Don't declare functions extern
Reviewed by: peter
|
134542 |
30-Aug-2004 |
peter |
Kill count device support from config. I've changed the last few remaining consumers to have the count passed as an option. This is i4b, pc98/wdc, and coda.
Bump configvers.h from 500013 to 600000.
Remove heuristics that tried to parse "device ed5" as 5 units of the ed device. This broke things like the snd_emu10k1 device, which required quotes to make it parse right. The no-longer-needed quotes have been removed from NOTES, GENERIC etc. eg, I've removed the quotes from: device snd_maestro device "snd_maestro3" device snd_mss
I believe everything will still compile and work after this.
|
134374 |
27-Aug-2004 |
tjr |
Remove bogus vrele() call added in previous.
|
134345 |
26-Aug-2004 |
tjr |
Improve the robustness of MSDOSFSMNT_KICONV handling:
- Use copyinstr() to read cs_win, cs_dos, cs_local strings from the mount argument structure instead of reading through user-space pointers(!).
- When mounting a filesystem, or updating an existing mount, only try to update the iconv handles from the information in the mount argument structure if the structure itself has the MSDOSFSMNT_KICONV flag set.
- Attempt to handle failure of update_mp() in the MNT_UPDATE case.
|
133776 |
15-Aug-2004 |
des |
Release the vnode cache mutex when calling vgone(), since vgone() may sleep. This makes pfs_exit() even less efficient than before, but on the bright side, the vnode cache mutex no longer needs to be recursive.
|
133741 |
15-Aug-2004 |
jmg |
Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops.
Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases).
Reviewed by: green, rwatson (both earlier versions)
|
133668 |
13-Aug-2004 |
rwatson |
Commit a work-around for a more general bug involving process state: check whether p_ucred is NULL or not in pfs_getattr() before dereferencing the credential, and return ENOENT if there wasn't one.
This is a symptom of a larger problem, wherein pfind() can return references to incompletely initialized processes, and we instead ought to not return them, or check the process state before acting on the process.
Reported by: kris Discussed with: tjr, others
|
133327 |
08-Aug-2004 |
phk |
use bufdone() not biodone().
|
133326 |
08-Aug-2004 |
phk |
Use bufdone(), not biodone().
|
133287 |
07-Aug-2004 |
phk |
Push all changes to disk before downgrading a mount from rw to ro.
|
132902 |
30-Jul-2004 |
phk |
Put a version element in the VFS filesystem configuration structure and refuse initializing filesystems with a wrong version. This will aid maintenance activities on the 5-stable branch.
s/vfs_mount/vfs_omount/
s/vfs_nmount/vfs_mount/
Name our filesystems mount function consistently.
Eliminate the nameidata argument to both vfs_mount and vfs_omount. It was originally there to save stack space. A few places abused it to get hold of some credentials to pass around. Effectively it is unused.
Reorganize the root filesystem selection code.
|
132805 |
28-Jul-2004 |
phk |
Remove global variable rootdevs and rootvp, they are unused as such.
Add local rootvp variables as needed.
Remove checks for miniroots in the swap partition. We never did that and most of the filesystems could never be used for that, but it had still been copy&pasted all over the place.
|
132772 |
28-Jul-2004 |
kan |
Avoid casts as lvalues.
|
132765 |
28-Jul-2004 |
kan |
Avoid casts as lvalues.
|
132653 |
26-Jul-2004 |
cperciva |
Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags.
The old name is still defined, but will be removed in a few days (unless I hear any complaints...)
Discussed with: rwatson, scottl Requested by: jhb
|
132547 |
22-Jul-2004 |
rwatson |
In devfs_allocv(), rather than assigning 'td = curthread', assert that the caller passes in a td that is curthread, and consistently pass 'td' into vget(). Remove some bogus logic that passed in td or curthread conditional on td being non-NULL, which seems redundant in the face of the earlier assignment of td to curthread if td is NULL.
In devfs_symlink(), cache the passed thread in 'td' so we don't have to keep retrieving it from the 'ap' structure, and assert that td is curthread (since we dereference it to get thread-local td_ucred). Use 'td' in preference to curthread for later lockmgr calls, since they are equal.
|
132199 |
15-Jul-2004 |
phk |
Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events.
A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
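The convention described above can be illustrated with a small userland sketch. The enum and function names here are hypothetical stand-ins, not the kernel's module event API; the only point is the shape of the handler: handle the events you know, and return EOPNOTSUPP from the default case.

```c
#include <errno.h>

/* Hypothetical event codes, standing in for the kernel's module events. */
enum sketch_event { EV_LOAD, EV_UNLOAD, EV_QUIESCE, EV_SHUTDOWN };

/*
 * Sketch of an event handler that only knows about load and unload.
 * Anything else is reported as "not supported" rather than "invalid".
 */
static int
sketch_modevent(enum sketch_event ev)
{
	switch (ev) {
	case EV_LOAD:
	case EV_UNLOAD:
		return (0);
	default:
		return (EOPNOTSUPP);
	}
}
```

A caller can then tell "this module has no opinion on the event" apart from a genuine failure.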
|
132094 |
13-Jul-2004 |
phk |
Another LINT compilation fix
|
132093 |
13-Jul-2004 |
phk |
Make LINT compile
|
132037 |
12-Jul-2004 |
rwatson |
Remove 'td = curthread' that shadows the arguments to coda_root().
Missed by: alfred
|
132023 |
12-Jul-2004 |
alfred |
Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.
|
131924 |
10-Jul-2004 |
marcel |
Update for the KDB framework:
o Call kdb_enter() instead of Debugger().
|
131923 |
10-Jul-2004 |
marcel |
Update for the KDB framework:
o Call kdb_enter() instead of Debugger().
o Make debugging code conditional upon KDB instead of DDB.
|
131871 |
09-Jul-2004 |
des |
Accumulate directory entries in a fixed-length sbuf, and uiomove them in one go before returning. This avoids calling uiomove() while holding allproc_lock.
Don't adjust uio->uio_offset manually, uiomove() does that for us.
Don't drop allproc_lock before calling panic().
Suggested by: alfred
|
131551 |
04-Jul-2004 |
phk |
When we traverse the vnodes on a mountpoint we need to look out for our cached 'next vnode' being removed from this mountpoint. If we find that it was recycled, we restart our traversal from the start of the list.
Code to do that is in all local disk filesystems (and a few other places) and looks roughly like this:
    MNT_ILOCK(mp);
  loop:
    for (vp = TAILQ_FIRST(&mp...);
         (vp = nvp) != NULL;
         nvp = TAILQ_NEXT(vp,...)) {
        if (vp->v_mount != mp)
            goto loop;
        MNT_IUNLOCK(mp);
        ...
        MNT_ILOCK(mp);
    }
    MNT_IUNLOCK(mp);
The code which takes vnodes off a mountpoint looks like this:
    MNT_ILOCK(vp->v_mount);
    ...
    TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
    ...
    MNT_IUNLOCK(vp->v_mount);
    ...
    vp->v_mount = something;
(Take a moment and try to spot the locking error before you read on.)
On an SMP system, one CPU could have removed nvp from our mountlist but not yet gotten to assign a new value to vp->v_mount while another CPU simultaneously gets to the top of the traversal loop where it finds that (vp->v_mount != mp) is not true despite the fact that the vnode has indeed been removed from our mountpoint.
Fix:
Introduce the macro MNT_VNODE_FOREACH() to traverse the list of vnodes on a mountpoint while taking into account that vnodes may be removed from the list as we go. This saves approx 65 lines of duplicated code.
Split the insmntque() which potentially moves a vnode from one mount point to another into delmntque() and insmntque() which does just what the names say.
Fix delmntque() to set vp->v_mount to NULL while holding the mountpoint lock.
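The last point of the fix can be sketched in userland as follows. This is not the kernel code: the struct and function names are hypothetical, and a pthread mutex stands in for the mountpoint lock. What it shows is that the list removal and the clearing of the back pointer happen inside the same critical section, so a traversal holding the lock never sees a vnode that is off the list but still claims to belong to the mount.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

struct mount_sketch;

/* Hypothetical, stripped-down vnode: only the fields the sketch needs. */
struct vnode_sketch {
	struct mount_sketch *v_mount;		/* owning mount, or NULL */
	TAILQ_ENTRY(vnode_sketch) v_link;	/* linkage on the mount's list */
};

/* Hypothetical mount with a lock protecting its vnode list. */
struct mount_sketch {
	pthread_mutex_t mnt_lock;
	TAILQ_HEAD(, vnode_sketch) mnt_vnodes;
};

static void
mount_sketch_init(struct mount_sketch *mp)
{
	pthread_mutex_init(&mp->mnt_lock, NULL);
	TAILQ_INIT(&mp->mnt_vnodes);
}

/* Put a vnode on a mount's list; pointer and list change under one lock. */
static void
insmntque_sketch(struct vnode_sketch *vp, struct mount_sketch *mp)
{
	pthread_mutex_lock(&mp->mnt_lock);
	vp->v_mount = mp;
	TAILQ_INSERT_TAIL(&mp->mnt_vnodes, vp, v_link);
	pthread_mutex_unlock(&mp->mnt_lock);
}

/* Take a vnode off its mount; v_mount is cleared before the lock is dropped. */
static void
delmntque_sketch(struct vnode_sketch *vp)
{
	struct mount_sketch *mp = vp->v_mount;

	if (mp == NULL)
		return;
	pthread_mutex_lock(&mp->mnt_lock);
	TAILQ_REMOVE(&mp->mnt_vnodes, vp, v_link);
	vp->v_mount = NULL;
	pthread_mutex_unlock(&mp->mnt_lock);
}
```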
|
131526 |
03-Jul-2004 |
phk |
Remove "register" keyword and trailing white space.
|
131523 |
03-Jul-2004 |
tjr |
By popular request, add a workaround that allows large (>128GB or so) FAT32 filesystems to be mounted, subject to some fairly serious limitations.
This works by extending the internal pseudo-inode-numbers generated from the file's starting cluster number to 64-bits, then creating a table mapping these into arbitrary 32-bit inode numbers, which can fit in struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings do not persist across unmounts or reboots, so it's not possible to export these filesystems through NFS. The mapping table may grow to be rather large, and may grow large enough to exhaust kernel memory on filesystems with millions of files.
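A toy version of such a mapping table might look like the sketch below. It is an illustration only, under assumed names and a fixed capacity; the real table's data structure and allocation policy are not described in this message. It does show why the mapping is volatile (numbers are handed out in order of first lookup) and why memory use grows with the number of files seen.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_CAPACITY 1024	/* arbitrary limit for the sketch */

/* Hypothetical table: slot i holds the 64-bit key mapped to number i + 1. */
struct fileno_map {
	uint64_t keys[MAP_CAPACITY];
	size_t used;
};

/*
 * Return the 32-bit number for a 64-bit pseudo-inode, assigning the next
 * free number the first time a key is seen.  Returns 0 if the table is full.
 */
static uint32_t
fileno_map_lookup(struct fileno_map *m, uint64_t key)
{
	size_t i;

	for (i = 0; i < m->used; i++)
		if (m->keys[i] == key)
			return ((uint32_t)(i + 1));
	if (m->used == MAP_CAPACITY)
		return (0);
	m->keys[m->used] = key;
	m->used++;
	return ((uint32_t)m->used);
}
```

Because the assigned numbers depend on lookup order, a fresh table after a remount would generally hand out different numbers for the same files.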
Don't enable this option unless you understand the consequences.
|
131003 |
24-Jun-2004 |
rwatson |
Remove spls from portal_open(). Acquire socket lock while sleeping waiting for the socket to connect and use msleep() on the socket mutex rather than tsleep(). Acquire socket buffer mutexes around read-modify-write of socket buffer flags.
|
130994 |
23-Jun-2004 |
scottl |
Make the udf_vnops side endian clean.
|
130986 |
23-Jun-2004 |
scottl |
First half of making UDF be endian-clean. This addresses the vfsops side.
|
130960 |
23-Jun-2004 |
bde |
Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/vnode.h> for the definition of mutex interfaces used in SOCKBUF_*LOCK().
Sorted includes.
Removed unused includes.
|
130952 |
23-Jun-2004 |
rwatson |
Remove unlocked read annotation for sbspace(); the read is locked.
|
130678 |
18-Jun-2004 |
phk |
Reduce a fair bit of the atomics because we are now called with a lock from kern_conf.c and cdev's act a lot more like real objects these days.
|
130665 |
18-Jun-2004 |
rwatson |
Merge some additional leaf node socket buffer locking from rwatson_netperf:
Introduce conditional locking of the socket buffer in fifofs kqueue filters; KNOTE() will be called holding the socket buffer locks in fifofs, but sometimes the kqueue() system call will poll using the same entry point without holding the socket buffer lock.
Introduce conditional locking of the socket buffer in the socket kqueue filters; KNOTE() will be called holding the socket buffer locks in the socket code, but sometimes the kqueue() system call will poll using the same entry points without holding the socket buffer lock.
Simplify the logic in sodisconnect() since we no longer need spls.
NOTE: To remove conditional locking in the kqueue filters, it would make sense to use a separate kqueue API entry into the socket/fifo code when calling from the kqueue() system call.
|
130653 |
17-Jun-2004 |
rwatson |
Merge additional socket buffer locking from rwatson_netperf:
- Lock down low hanging fruit use of sb_flags with socket buffer lock.
- Lock down low hanging fruit use of so_state with socket lock.
- Lock down low hanging fruit use of so_options.
- Lock down low-hanging fruit use of sb_lowwat and sb_hiwat with socket buffer lock.
- Annotate situations in which we unlock the socket lock and then grab the receive socket buffer lock, which are currently actually the same lock. Depending on how we want to play our cards, we may want to coalesce these lock uses to reduce overhead.
- Convert a if()->panic() into a KASSERT relating to so_state in soaccept().
- Remove a number of splnet()/splx() references.
More complex merging of socket and socket buffer locking to follow.
|
130640 |
17-Jun-2004 |
phk |
Second half of the dev_t cleanup.
The big lines are:
    NODEV -> NULL
    NOUDEV -> NODEV
    udev_t -> dev_t
    udev2dev() -> findcdev()
Various minor adjustments including handling of userland access to kernel space struct cdev etc.
|
130585 |
16-Jun-2004 |
phk |
Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
|
130551 |
16-Jun-2004 |
julian |
Nice is a property of a process as a whole. I mistakenly moved it to the ksegroup when breaking up the process structure. Put it back in the proc structure.
|
130513 |
15-Jun-2004 |
rwatson |
Grab the socket buffer send or receive mutex when performing a read-modify-write on the sb_state field. This commit catches only the "easy" ones where it doesn't interact with as yet unmerged locking.
|
130480 |
14-Jun-2004 |
rwatson |
The socket field so_state is used to hold a variety of socket related flags relating to several aspects of socket functionality. This change breaks out several bits relating to send and receive operation into a new per-socket buffer field, sb_state, in order to facilitate locking. This is required because, in order to provide more granular locking of sockets, different state fields have different locking properties. The following fields are moved to sb_state:
    SS_CANTRCVMORE (so_state)
    SS_CANTSENDMORE (so_state)
    SS_RCVATMARK (so_state)
Rename respectively to:
    SBS_CANTRCVMORE (so_rcv.sb_state)
    SBS_CANTSENDMORE (so_snd.sb_state)
    SBS_RCVATMARK (so_rcv.sb_state)
This facilitates locking by isolating fields to be located with other identically locked fields, and permits greater granularity in socket locking by avoiding storing fields with different locking semantics in the same short (avoiding locking conflicts). In the future, we may wish to coalesce sb_state and sb_flags; for the time being I leave them separate and there is no additional memory overhead due to the packing/alignment of shorts in the socket buffer structure.
|
129911 |
01-Jun-2004 |
truckman |
Add MSG_NBIO flag option to soreceive() and sosend() that causes them to behave the same as if the SS_NBIO socket flag had been set for this call. The SS_NBIO flag for ordinary sockets is set by fcntl(fd, F_SETFL, O_NONBLOCK).
Pass the MSG_NBIO flag to the soreceive() and sosend() calls in fifo_read() and fifo_write() instead of frobbing the SS_NBIO flag on the underlying socket for each I/O operation. The O_NONBLOCK flag is a property of the descriptor, and unlike ordinary sockets, fifos may be referenced by multiple descriptors.
|
129880 |
30-May-2004 |
phk |
add missing #include <sys/module.h>
|
129355 |
17-May-2004 |
truckman |
Switch from using the vnode interlock to a private mutex in fifo_open() to avoid lock order problems when manipulating the sockets associated with the fifo.
Minor optimization of a couple of calls to fifo_cleanup() from fifo_open().
|
128992 |
06-May-2004 |
alc |
Make vm_page's PG_ZERO flag immutable between the time of the page's allocation and deallocation. This flag's principal use is shortly after allocation. For such cases, clearing the flag is pointless. The only unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed page.
Reviewed by: tegge@
|
128171 |
12-Apr-2004 |
phk |
Do not drop Giant around the poll method yet, we're not ready for it.
|
128019 |
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|
127694 |
01-Apr-2004 |
pjd |
Remove ps_argsopen from this check, for two reasons:
1. This check is wrong, because it is true by default (kern.ps_argsopen is 1 by default) (p_cansee() is not even checked).
2. Sysctl kern.ps_argsopen is going away.
|
127652 |
31-Mar-2004 |
rwatson |
Export uipc_connect2() from uipc_usrreq.c instead of unp_connect2(), and consume that interface in portalfs and fifofs instead. In the new world order, unp_connect2() assumes that the unpcb mutex is held, whereas uipc_connect2() validates that the passed sockets are UNIX domain sockets, then grabs the mutex.
NB: the portalfs and fifofs code gets down and dirty with UNIX domain sockets. Maybe this is a bad thing.
|
127603 |
30-Mar-2004 |
scottl |
Catch all cases where bread() returns an error and a valid *bp, and release the *bp.
Obtained from: DragonFlyBSD
|
127592 |
29-Mar-2004 |
peter |
Clean up the stub fake vnode locking implementations. The main reason this stuff was here (NFS) was fixed by Alfred in November. The only remaining consumer of the stub functions was umapfs, which is horribly horribly broken. It has missed out on about the last 5 years worth of maintenance that was done on nullfs (from which umapfs is derived). It needs major work to bring it up to date with the vnode locking protocol. umapfs really needs to find a caretaker to bring it into the 21st century.
Functions GC'ed: vop_noislocked, vop_nolock, vop_nounlock, vop_sharedlock.
|
126998 |
14-Mar-2004 |
rwatson |
Don't reject FAT file systems with a number of "Heads" greater than 255; USB keychains exist that use 256 as the number of heads. This check has also been removed in Darwin (along with most of the other head/sector sanity checks).
|
126975 |
14-Mar-2004 |
green |
When taking event callbacks (like process_exit) out from under Giant, those which do not lock Giant themselves will be exposed. Unbreak pfs_exit().
|
126858 |
11-Mar-2004 |
phk |
When I was a kid my work table was one cluttered mess and cleaning it up was a rather overwhelming task. I soon learned that if you don't know where you're going to store something, at least try to pile it next to something slightly related in the hope that a pattern emerges.
Apply the same principle to the ffs/snapshot/softupdates code which has leaked into specfs: Add yet another buf-quasi-method and call it from the only two places I can see it can make a difference and implement the magic in ffs_softdep.c where it belongs.
It's not pretty, but at least it's one less layer violated.
|
126851 |
11-Mar-2004 |
phk |
Remove unused second arg to vfinddev(). Don't call addaliasu() on VBLK nodes.
|
126823 |
10-Mar-2004 |
phk |
Don't call devsw() more than we need to, and in particular do not expose ourselves to device removal by not checking for it the second time.
Use count_dev(dev) rather than vcount(vp)
|
126532 |
03-Mar-2004 |
scottl |
Change __FUNCTION__ to __func__
Submitted by: Stefan Farfeleder
|
126425 |
01-Mar-2004 |
rwatson |
Rename dup_sockaddr() to sodupsockaddr() for consistency with other functions in kern_socket.c.
Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT in from the caller context rather than "1" or "0".
Correct mflags pass into mac_init_socket() from previous commit to not include M_ZERO.
Submitted by: sam
|
126191 |
24-Feb-2004 |
phk |
Do not attempt to open NODEV
|
126133 |
23-Feb-2004 |
tjr |
Fix comment containing vop_readdir_args contents: a_cookies is really u_long ** not u_long *.
|
126132 |
23-Feb-2004 |
tjr |
cookies is an array of u_long, not u_int, so MALLOC() it accordingly. Allocating it with the wrong size could have caused corruption on 64-bit architectures.
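One common way to avoid this class of bug is to size the allocation from the pointer's own element type rather than naming a type by hand. The sketch below is userland C with a hypothetical helper name and calloc() standing in for the kernel allocator; it is not the code from this commit.

```c
#include <stdlib.h>

/*
 * Allocate an array of ncookies directory cookies.  Using sizeof(*cookies)
 * keeps the element size tied to the declared type, so the allocation stays
 * correct on platforms where unsigned long is wider than unsigned int.
 */
static unsigned long *
alloc_cookies(size_t ncookies)
{
	unsigned long *cookies;

	cookies = calloc(ncookies, sizeof(*cookies));
	return (cookies);
}
```

With a hand-written sizeof(unsigned int), the same call would reserve only half the needed space on typical LP64 systems, and writes to the upper half of the array would land outside the allocation.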
|
126086 |
21-Feb-2004 |
bde |
Fixed a serious off by 1 error. The cluster-in-use bitmap was overrun by 1 u_int if the number of clusters was 1 more than a multiple of (8 * sizeof(u_int)). The bitmap is malloced and large (often huge), so fatal overrun probably only occurred if the number of clusters was 1 more than 1 multiple of PAGE_SIZE/8.
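The arithmetic behind this kind of overrun can be sketched as below. The helper is hypothetical and only shows the usual round-up computation for a bitmap stored in unsigned int words; the message does not say exactly how the original code computed the size, so this is an illustration of the correct count rather than a copy of the fix.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits held by one bitmap word. */
#define BITS_PER_WORD	(CHAR_BIT * sizeof(unsigned int))

/*
 * Number of unsigned int words needed to hold nbits bits.  Rounding up
 * matters exactly when nbits is not a multiple of BITS_PER_WORD: one bit
 * past a multiple needs one whole extra word.
 */
static size_t
bitmap_words(size_t nbits)
{
	return ((nbits + BITS_PER_WORD - 1) / BITS_PER_WORD);
}
```

A truncating division (nbits / BITS_PER_WORD) would come out one word short in precisely the "1 more than a multiple" case the message describes.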
|
126082 |
21-Feb-2004 |
phk |
Device megapatch 6/6:
This is what we came here for: Hang dev_t's from their cdevsw, refcount cdevsw and dev_t and generally keep track of things a lot better than we used to:
Hold a cdevsw reference around all entrances into the device driver, this will be necessary to safely determine when we can unload driver code.
Hold a dev_t reference while the device is open.
KASSERT that we do not enter the driver on a non-referenced dev_t.
Remove old D_NAG code, anonymous dev_t's are not a problem now.
When destroy_dev() is called on a referenced dev_t, move it to dead_cdevsw's list. When the refcount drops, free it.
Check that cdevsw->d_version is correct. If not, set all methods to the dead_*() methods to prevent entrance into driver. Print warning on console to this effect. The device driver may still explode if it is also incompatible with newbus, but in that case we probably didn't get this far in the first place.
|
126081 |
21-Feb-2004 |
phk |
Device megapatch 5/6:
Remove the unused second argument from udev2dev().
Convert all remaining users of makedev() to use udev2dev(). The semantic difference is that udev2dev() will only locate a pre-existing dev_t, it will not, like makedev(), create a new one.
Apart from the tiny well controlled window in D_PSEUDO drivers, there should no longer be any "anonymous" dev_t's in the system now, only dev_t's created with make_dev() and make_dev_alias()
|
126080 |
21-Feb-2004 |
phk |
Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION.
Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
|
126019 |
19-Feb-2004 |
phk |
Report the correct length for symlink entries.
|
125992 |
19-Feb-2004 |
tjr |
Use size_t or ssize_t wherever appropriate instead of casting from int * to size_t *, which is incorrect because they may have different widths. This caused some subtle forms of corruption, the most frequently reported one being that the last character of a filename was sometimes duplicated on amd64.
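The safe pattern can be sketched as follows. Both functions are hypothetical: consume_sketch() stands in for any routine that reports remaining lengths through size_t pointers, and copy_name_sketch() is a caller that holds its lengths as int. Instead of casting &srclen to size_t *, the caller copies the values into real size_t variables and converts back afterwards.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical converter: copies as many bytes as fit and decrements both
 * length counters through size_t pointers.
 */
static void
consume_sketch(const char *src, size_t *srclen, char *dst, size_t *dstlen)
{
	size_t n = *srclen < *dstlen ? *srclen : *dstlen;

	memcpy(dst, src, n);
	*srclen -= n;
	*dstlen -= n;
}

/*
 * Caller with int lengths.  Passing (size_t *)&srclen would make the callee
 * read and write 8 bytes through a pointer to a 4-byte object on LP64
 * systems; using properly typed temporaries avoids that.
 * Returns the number of bytes written to dst.
 */
static int
copy_name_sketch(const char *src, int srclen, char *dst, int dstlen)
{
	size_t in = (size_t)srclen;
	size_t out = (size_t)dstlen;

	consume_sketch(src, &in, dst, &out);
	return (dstlen - (int)out);
}
```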
|
125942 |
17-Feb-2004 |
trhodes |
Do not place dirmask in unnamed padding. Move it to the bottom of this list where it should have been added originally.
Prodded by: bde
|
125934 |
17-Feb-2004 |
tjr |
If the "next free cluster" field of the FSInfo block is 0xFFFFFFFF, it means that the correct value is unknown. Since this value is just a hint to improve performance, initially assume that the first non-reserved cluster is free, then correct this assumption if necessary before writing the FSInfo block back to disk.
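The handling of the sentinel can be sketched like this. The macro and function names are hypothetical; the sketch assumes the usual FAT convention that clusters 0 and 1 are reserved, so the first non-reserved cluster is 2.

```c
#include <stdint.h>

#define FSI_NEXTFREE_UNKNOWN	0xFFFFFFFFu	/* "value not known" marker */
#define FIRST_DATA_CLUSTER	2u		/* clusters 0 and 1 are reserved */

/*
 * Pick the starting point for free-cluster searches from the FSInfo hint.
 * An unknown hint falls back to the first non-reserved cluster; since the
 * value is only a performance hint, a wrong guess is corrected later.
 */
static uint32_t
initial_next_free(uint32_t fsinfo_hint)
{
	if (fsinfo_hint == FSI_NEXTFREE_UNKNOWN)
		return (FIRST_DATA_CLUSTER);
	return (fsinfo_hint);
}
```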
PR: 62826 MFC after: 2 weeks
|
125855 |
15-Feb-2004 |
phk |
White-space align a struct definition. Move a SYSINIT to the file where it belongs.
|
125796 |
14-Feb-2004 |
bde |
Fixed some style bugs:
- don't unlock the vnode after vinvalbuf() only to have to relock it almost immediately.
- don't refer to devices classified by vn_isdisk() as block devices.
|
125739 |
12-Feb-2004 |
bde |
MFffs (ffs_vfsops.c 1.227: clean up open mode bandaid). This reduces gratuitous differences with ffs a little.
|
125671 |
10-Feb-2004 |
nectar |
Fix a panic in pseudofs(9) that could occur when doing an I/O operation with a large request or large offset.
Reported by: Joel Ray Holveck <joelh@piquan.org> Submitted by: des
|
125637 |
10-Feb-2004 |
tjr |
Fixes problems that occurred when a file was removed and a directory created with the same name, and vice versa:
- Immediately recycle vnodes of files & directories that have been deleted or renamed.
- When looking up an entry in the VFS name cache or smbfs's private cache, make sure the vnode type is consistent with the type of file the server thinks it is, and re-create the vnode if it isn't.
The alternative to this is to recycle vnodes unconditionally when their use count drops to 0, but this would make all the caching we do mostly useless.
PR: 62342 MFC after: 2 weeks
|
125454 |
04-Feb-2004 |
jhb |
Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it.
- Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.
Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
|
124804 |
21-Jan-2004 |
cperciva |
Fix style(9) of my previous commit.
Noticed by: nate Approved by: nate, rwatson (mentor)
|
124798 |
21-Jan-2004 |
cperciva |
Allow devfs path rules to work on directories. Without this fix, devfs rule add path fd unhide is a no-op, while it should unhide the fd subdirectory.
Approved by: phk, rwatson (mentor) PR: kern/60897
|
124728 |
19-Jan-2004 |
kan |
Spell magic '16' number as IO_SEQSHIFT.
|
124600 |
16-Jan-2004 |
green |
Do not allow operations which cause known file-system corruption.
|
124599 |
16-Jan-2004 |
green |
Remove a warning.
|
124593 |
16-Jan-2004 |
green |
Fix an upper-vnode leak created in revision 1.52. When an upper-layer file has been removed, it should be purged from the cache, but it need not be removed from the directory stack causing corruption; instead, it will simply be removed once the last references and holds on it are dropped at the end of the unlink/rmdir system calls, and the normal !UN_CACHED VOP_INACTIVE() handler for unionfs finishes it off.
This is easily reproduced by repeated "echo >file; rm file" on a unionfs mount. Strangely, "echo -n >file; rm file" didn't make it happen.
|
124434 |
12-Jan-2004 |
tjr |
Fix an inverted test for NOPEN in the unused function smb_smb_flush().
|
124404 |
11-Jan-2004 |
truckman |
Don't try to unlock the directory vnode in null_lookup() if the lock is shared with the underlying file system and the lookup in the underlying file system did the unlock for us.
|
124326 |
10-Jan-2004 |
tjr |
Restore closing of SMB find handle in smbfs_close().
|
124219 |
07-Jan-2004 |
rwatson |
Lock p->p_textvp before calling vn_fullpath() on it. Note the potential lock order concern due to the vnode lock held simultaneously by the caller into procfs.
Reported by: kuriyama Approved by: des
|
124115 |
04-Jan-2004 |
tjr |
In smbfs_inactive(), only invalidate the node's attribute cache if we had to send a file close request to the server.
|
124090 |
03-Jan-2004 |
tjr |
Pass ACL, extended attribute and MAC vnode ops down the vnode stack.
|
124081 |
02-Jan-2004 |
phk |
Improve on POLA by populating DEVFS before doing devfs(8) rule ioctls.
PR: 60687 Spotted by: Colin Percival <cperciva@daemonology.net>
|
123967 |
29-Dec-2003 |
bde |
Fixed some (most) style bugs in rev.1.33. Mainly 4-char indentation (msdosfs uses normal 8-char indentation almost everywhere else), too-long lines, and minor English usage errors. The verbose formal comment before the new function is still abnormal.
|
123964 |
29-Dec-2003 |
bde |
Fixed some minor style bugs in rev.1.144. All related to msdosfs_advlock() (mainly unsorting). There were no changes related to the dirty flag here. The reference NetBSD implementation put msdosfs_advlock() in a different place. This commit only moves its declarations and changes some of the function body to be like the NetBSD version.
|
123963 |
29-Dec-2003 |
bde |
Fixed style bugs in rev.1.112. The bugs started with obscure magic numbers in comments (Apple PR numbers?) and didn't improve.
|
123932 |
28-Dec-2003 |
bde |
v_vxproc was a bogus name for a thread (pointer).
|
123873 |
26-Dec-2003 |
trhodes |
Make msdosfs support the dirty flag in FAT16 and FAT32. Enable lockf support.
PR: 55861 Submitted by: Jun Su <junsu@m-net.arbornet.org> (original version) Reviewed by: make universe
|
123724 |
22-Dec-2003 |
tjr |
Make oldsize in smbfs_getattr() 64 bits wide instead of 32 to avoid truncation when files are larger than 4GB.
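The truncation being avoided is easy to demonstrate in isolation. The helper below is a hypothetical illustration, not code from smbfs: it reports whether a 64-bit file size would change value when squeezed through a 32-bit variable.

```c
#include <stdint.h>

/*
 * Return nonzero if storing size in a 32-bit variable would lose
 * information, i.e. if the value is 4GB (2^32 bytes) or larger.
 */
static int
truncates_in_32bits(uint64_t size)
{
	return ((uint64_t)(uint32_t)size != size);
}
```

A 32-bit "old size" compared against a 64-bit new size would therefore look different for every file of 4GB or more, even when nothing changed.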
|
123559 |
16-Dec-2003 |
tjr |
Avoid sign extension when casting signed characters to unsigned wide characters in ntfs_u28(). This fixes the conversion of filenames containing single-byte characters with the high bit set.
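The underlying C pitfall can be shown with a short sketch. The function name is hypothetical and this is not the body of ntfs_u28(); it only demonstrates the intermediate cast through unsigned char, which is the standard way to widen a byte without sign extension. It assumes a platform where plain char is signed, which is where the problem arises.

```c
#include <stdint.h>

/*
 * Widen one byte to a 16-bit character.  Casting through unsigned char
 * first means a byte such as 0xE9 becomes 0x00E9; converting a signed
 * char holding that byte directly would yield 0xFFE9 instead.
 */
static uint16_t
widen_byte(char c)
{
	return ((uint16_t)(unsigned char)c);
}
```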
|
123293 |
08-Dec-2003 |
fjoe |
Make msdosfs long filenames matching case insensitive again.
PR: 59765 Submitted by: Ryuichiro Imura <imura@ryu16.org>
|
123248 |
07-Dec-2003 |
des |
Constify, and add an API function to find a named node in a directory.
|
123247 |
07-Dec-2003 |
des |
Minor whitespace and style issues.
|
123245 |
07-Dec-2003 |
des |
Remove useless SMP check code.
|
123215 |
07-Dec-2003 |
scottl |
Re-arrange and consolidate some random debugging stuff
|
122893 |
19-Nov-2003 |
kan |
Fix vnode locking in fdesc_setattr. Lock vnode before invoking VOP_SETATTR on it.
Approved by: re@ (rwatson)
|
122772 |
16-Nov-2003 |
truckman |
Use "fip->fi_readers == 0 && fip->fi_writers == 0" as the condition for disposing fifo resources in fifo_cleanup() instead of using "vp->v_usecount == 1". There may be other references to the vnode, for instance by nullfs, at the time fifo_open() or fifo_close() is called, which could cause a resource leak.
Don't bother grabbing the vnode interlock in fifo_cleanup() since it no longer accesses v_usecount.
|
122652 |
14-Nov-2003 |
das |
- A sanity check in unionfs verifies that lookups of '.' return the vnode of the parent. However, this check should not be performed if the lookup failed. This change should fix "union_lookup returning . not same as startdir" panics people were seeing. The bug was introduced by an incomplete import of a NetBSD delta in rev 1.38.
- Move the aforementioned check out from DIAGNOSTIC. Performance is the least of our unionfs worries.
- Minor reorganization.
PR: 53004 MFC after: 1 week
|
122608 |
13-Nov-2003 |
phk |
Initialize b_iooffset correctly.
|
122552 |
12-Nov-2003 |
phk |
Don't mess around with spare fields of public structures.
|
122551 |
12-Nov-2003 |
phk |
Don't mess about with spare fields in public structures.
|
122524 |
12-Nov-2003 |
rwatson |
Modify the MAC Framework so that instead of embedding a (struct label) in various kernel objects to represent security data, we embed a (struct label *) pointer, which now references labels allocated using a UMA zone (mac_label.c). This allows the size and shape of struct label to be varied without changing the size and shape of these kernel objects, which become part of the frozen ABI with 5-STABLE. This opens the door for boot-time selection of the number of label slots, and hence changes to the bound on the number of simultaneous labeled policies at boot-time instead of compile-time. This also makes it easier to embed label references in new objects as required for locking/caching with fine-grained network stack locking, such as inpcb structures.
This change also moves us further in the direction of hiding the structure of kernel objects from MAC policy modules, not to mention dramatically reducing the number of '&' symbols appearing in both the MAC Framework and MAC policy modules, and improving readability.
While this results in minimal performance change with MAC enabled, it will observably shrink the size of a number of critical kernel data structures (e.g., struct vnode, struct socket) for the !MAC case, and should have a small (but measurable) performance benefit due to memory conservation and reduced cost of zeroing memory.
NOTE: Users of MAC must recompile their kernel and all MAC modules as a result of this change. Because this is an API change, third party MAC modules will also need to be updated to make less use of the '&' symbol.
Suggestions from: bmilekic Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
122444 |
10-Nov-2003 |
truckman |
If fifo_open() is interrupted, fifo_close() may not get called, causing a resource leak. Move the resource deallocation code from fifo_close() to a new function, fifo_cleanup(), and call fifo_cleanup() from fifo_close() and the appropriate places in fifo_open().
Tested by: Lukas Ertl Pointy hat to: truckman
|
122352 |
09-Nov-2003 |
tanimura |
- Implement selwakeuppri() which allows raising the priority of a thread being woken up. The thread woken up can run at a priority as high as after tsleep().
- Replace selwakeup()s with selwakeuppri()s and pass appropriate priorities.
- Add cv_broadcastpri() which raises the priority of the broadcast threads. Used by selwakeuppri() if collision occurs.
Not objected in: -arch, -current
|
122102 |
05-Nov-2003 |
scottl |
Add hooks for translating directories entries using the iconv methods.
Submitted by: imura@ryu16.org
|
122101 |
05-Nov-2003 |
scottl |
Add udf_UncompressUnicodeByte() for processing cs0 strings in a way that the iconv methods can handle
Submitted by: imura@ryu16.org
|
122091 |
05-Nov-2003 |
kan |
Remove mntvnode_mtx and replace it with per-mountpoint mutex. Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to operate on this mutex transparently.
Eventually new mutex will be protecting more fields in struct mount, not only vnode list.
Discussed with: jeff
|
121874 |
02-Nov-2003 |
kan |
Take care not to call vput if thread used in corresponding vget wasn't curthread, i.e. when we receive a thread pointer to use as a function argument. Use VOP_UNLOCK/vrele in these cases.
The only case where td != curthread known at the moment is boot() calling sync with thread0 pointer.
This fixes the panic on shutdown people have reported.
|
121859 |
01-Nov-2003 |
kan |
Remove now unused variable.
|
121847 |
01-Nov-2003 |
kan |
Temporarily undo parts of the struct mount locking commit by jeff. It is unsafe to hold a mutex across vput/vrele calls.
This will be redone when a better locking strategy is agreed upon.
Discussed with: jeff
|
121842 |
01-Nov-2003 |
kan |
Do not bother walking mount point vnode list just to calculate the number of vnodes. Use precomputed mp->mnt_nvnodelistsize value instead.
|
121281 |
20-Oct-2003 |
phk |
Remember to check the DE_WHITEOUT flag in the case where a cloned device is hidden by a devfs(8) rule.
Spotted by: Adam Nowacki <ptnowak@bsk.vectranet.pl>
|
121270 |
20-Oct-2003 |
phk |
When a driver successfully created a device on demand, we can directly pick up the DEVFS inode number from the dev_t and find our directory entry from that, we don't need to scan the directory to find it.
This also solves an issue with on-demand devices in subdirectories.
Submitted by: cognet
|
121247 |
19-Oct-2003 |
mux |
Remove debug printf().
|
121223 |
18-Oct-2003 |
phk |
Initialize b_iooffset before calling strategy
|
121205 |
18-Oct-2003 |
phk |
DuH!
bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in the file)
|
121198 |
18-Oct-2003 |
phk |
Initialize b_offset before calling VOP_SPECSTRATEGY()
|
121196 |
18-Oct-2003 |
phk |
Initialize b_offset before calling VOP_STRATEGY/VOP_SPECSTRATEGY.
Remove various comments of KASSERTS and comments about B_PHYS which does not apply anymore.
|
121190 |
18-Oct-2003 |
phk |
Convert some if(bla) panic("foo") to KASSERTS to improve grep-ability.
|
121121 |
15-Oct-2003 |
phk |
Introduce a new optional memberfunction for cdevsw, fdopen() which passes the fdidx from VOP_OPEN down.
This is for all I know the final API for this functionality, but the locking semantics for messing with the filedescriptor from the device driver are not settled at this time.
|
120794 |
05-Oct-2003 |
bde |
Include <sys/mutex.h>. Don't depend on namespace pollution in <sys/vnode.h>.
Fixed a nearby style bug. The include of vcoda.h used angle brackets and was not used.
|
120785 |
05-Oct-2003 |
jeff |
- Check the XLOCK prior to inspecting v_data.
|
120784 |
05-Oct-2003 |
jeff |
- Check XLOCK prior to accessing v_data.
|
120778 |
05-Oct-2003 |
jeff |
- Don't cache_purge() in cd9660_reclaim. vclean() does it for us so this is redundant.
|
120775 |
05-Oct-2003 |
jeff |
- Don't cache_purge() in *_reclaim routines. vclean() does it for us so this is redundant.
|
120770 |
04-Oct-2003 |
alc |
Synchronize access to a vm page's valid field using the containing vm object's lock.
|
120735 |
04-Oct-2003 |
jeff |
- Make proper use of the mntvnode_mtx. We do not need the loop label because we do not drop the mntvnode_mtx. If this code had ever executed and hit the loop condition it would have spun forever.
|
120733 |
04-Oct-2003 |
jeff |
- Acquire the vnode interlock prior to dropping the mntvnode_mtx. This does not eliminate races where the vnode could be reclaimed and end up with a NULL v_data pointer but Giant is protecting us from that at the moment.
|
120731 |
04-Oct-2003 |
alc |
Synchronize access to a page's valid field by using the lock from its containing object.
|
120730 |
04-Oct-2003 |
jeff |
- Remove the backtrace() call from the *_vinvalbuf() functions. Thanks to a stack trace supplied by phk, I now understand what's going on here. The check for VI_XLOCK stops us from calling vinvalbuf once the vnode has been partially torn down in vclean(). It is not clear that this would cause a problem. Document this in nfs_bio.c, which is where the other two filesystems copied this code from.
|
120665 |
02-Oct-2003 |
nectar |
Introduce a uiomove_frombuf helper routine that handles computing and validating the offset within a given memory buffer before handing the real work off to uiomove(9).
Use uiomove_frombuf in procfs to correct several issues with integer arithmetic that could result in underflows/overflows. As a side-effect, the code is significantly simplified.
Add additional sanity checks when computing a memory allocation size in pfs_read.
Submitted by: rwatson (original uiomove_frombuf -- bugs are mine :-) Reported by: Joost Pol <joost@pine.nl> (integer underflows/overflows)
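As a rough illustration of the offset validation described above, here is a userland analogue. It is a sketch only: copy_frombuf() and its signature are hypothetical and are not the kernel's uiomove_frombuf(); the point is that the offset is validated before any subtraction, so the remaining length cannot underflow.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userland analogue: copy up to *resid bytes from buf,
 * starting at offset, into dst. On success *resid is reduced by the
 * number of bytes copied.
 */
static int
copy_frombuf(const void *buf, size_t buflen, long long offset,
    void *dst, size_t *resid)
{
	size_t avail, n;

	if (offset < 0)
		return (EINVAL);		/* reject negative offsets */
	if ((unsigned long long)offset >= buflen)
		return (0);			/* at or past the end: nothing to copy */
	avail = buflen - (size_t)offset;	/* cannot underflow after the check */
	n = *resid < avail ? *resid : avail;
	memcpy(dst, (const char *)buf + offset, n);
	*resid -= n;
	return (0);
}
```

A caller that asks for 8 bytes at offset 3 of a 5-byte buffer gets 2 bytes and a residual count of 6; a negative offset is rejected with EINVAL.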
|
120583 |
29-Sep-2003 |
rwatson |
Add a new column to the procfs map to hold the name of the mapped file for vnode mappings. Note that this uses vn_fullpath() and may be somewhat unreliable, although not too unreliable for shared libraries. For non-vnode mappings, just print "-" for the field.
Obtained from: TrustedBSD Projects Sponsored by: DARPA, AFRL, Network Associates Laboratories
|
120511 |
27-Sep-2003 |
phk |
forgot to remove static declaration of fdesc_poll()
|
120509 |
27-Sep-2003 |
phk |
fdesc_poll() called seltrue() to do the default thing, this is pointlessly wrong when we have a default in vop_nopoll() which does the right thing.
|
120498 |
27-Sep-2003 |
bde |
Fixed some style bugs in previous commit. Mainly, forward-declare struct msdosfsmount so that this file has the same prerequisites as it used to. The new prerequisite was a meta-style bug. It required many style bugs (unsorted includes ...) elsewhere.
Formatted prototypes in KNF. Resisted urge to sort all the prototypes, to minimise differences with NetBSD. (NetBSD has reformatted the prototypes but has not sorted them and still uses __P(()).)
|
120492 |
26-Sep-2003 |
fjoe |
- Support for multibyte charsets in LIBICONV.
- CD9660_ICONV, NTFS_ICONV and MSDOSFS_ICONV kernel options (with corresponding modules).
- kiconv(3) for loadable charset conversion tables support.
Submitted by: Ryuichiro Imura <imura@ryu16.org>
|
120471 |
26-Sep-2003 |
tjr |
Allow the [, ], and = characters in non-8.3 filenames since they are allowed by Windows (ref: MS KB article 120138).
XXX From my reading of the CIFS specification, it's not clear that clients need to validate filenames at all.
PR: 57123 Submitted by: Paul Coucher MFC after: 1 month
|
120264 |
19-Sep-2003 |
jeff |
- Remove interlock protection around VI_XLOCK. The interlock is not sufficient to guarantee that this race is not hit. The XLOCK will likely have to be redesigned due to the way reference counting and mutexes work in FreeBSD. We currently can not be guaranteed that xlock was not set and cleared while we were blocked on the interlock while waiting to check for XLOCK. This would lead us to reference a vnode which was not the vnode we requested.
- Add a backtrace() call inside of INVARIANTS in the hopes of finding out if this condition is ever hit. It should not, since we should be retaining a reference to the vnode in these cases. The reference would be sufficient to block recycling.
|
120011 |
13-Sep-2003 |
tjr |
Move an overly verbose message under #ifdef CODA_VERBOSE.
|
119942 |
10-Sep-2003 |
tjr |
Move an annoying printf() call that gets triggered every time an operation is interrupted (with ^C or ^Z) under CODA_VERBOSE.
|
119832 |
07-Sep-2003 |
tjr |
Add support for the Coda 6.x venus<->kernel interface. This extends FIDs to be 128-bits wide and adds support for realms.
Add a new CODA_COMPAT_5 option, which requests support for the old Coda 5.x interface instead of the new one.
Create a new coda5.ko module that supports the 5.x interface, and make the existing coda.ko module use the new 6.x interface. These modules cannot both be loaded at the same time.
Obtained from: Jan Harkes & the coda-6.0.2 distribution, NetBSD (drochner) (CODA_COMPAT_5 option).
|
119514 |
28-Aug-2003 |
marcel |
The valid field in struct vm_page can be of type unsigned long when 32K pages are selected. In spec_getpages() change the printf format specifier and add an explicit cast so that we always print the field as a long type.
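A small sketch of the cast-then-print idiom this describes. The typedef and function name are hypothetical stand-ins, not the real struct vm_page field; the idea is that one fixed format specifier stays correct whichever width the configuration selects.

```c
#include <stdio.h>

/* Hypothetical stand-in: the width of this type depends on configuration. */
typedef unsigned int demo_valid_t;

/* Format the field with a fixed specifier by casting to a known type. */
static int
format_valid(char *buf, size_t len, demo_valid_t valid)
{
	return (snprintf(buf, len, "valid=0x%lx", (unsigned long)valid));
}
```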
|
119318 |
22-Aug-2003 |
alc |
Use the requested page's object field instead of the vnode's. In some cases, the vnode's object field is not initialized leading to a NULL pointer dereference when the object is locked.
Tested by: rwatson
|
119122 |
19-Aug-2003 |
des |
Add pfs_visible() checks to pfs_getattr() and pfs_getextattr(). This also fixes pfs_access() since it relies on VOP_GETATTR() which will call pfs_getattr(). This prevents jailed processes from discovering the existence, start time and ownership of processes outside the jail.
PR: kern/48156
|
119091 |
18-Aug-2003 |
jhb |
Spell the name of the lock right in addition to getting the type right.
Submitted by: Kim Culhan <kimc@w8hd.org>
|
119089 |
18-Aug-2003 |
jhb |
The allproc lock is a sx lock, not a mutex, so fix the assertion. This asserts that the sx lock is held, but does not specify if the lock is held shared or exclusive, thus either type of lock satisfies the assertion.
|
119069 |
18-Aug-2003 |
des |
Rework pfs_iterate() a bit to eliminate a bug related to process directories. Previously, pfs_iterate() would return -1 when it reached the end of the process list while processing a process directory node, even if the parent directory contained further nodes (which is the case for the linprocfs root directory, where the process directory node is actually first in the list). With this patch, pfs_iterate() will continue to traverse the parent directory's node list after exhausting the process list (as was the intention all along). The code should hopefully be easier to read as well.
While I'm here, have pfs_iterate() assert that the allproc lock is held.
|
119055 |
17-Aug-2003 |
phk |
Do not call VOP_BMAP() on our own vnodes.
It is particularly silly when all it does is a minor piece of math.
|
118907 |
14-Aug-2003 |
rwatson |
Add p_candebug() check to access a process map file in procfs; limit access to map information for processes that you wouldn't otherwise have debug rights on.
Tested by: bms
|
118837 |
12-Aug-2003 |
trhodes |
Add a '-M mask' option so that users can have different masks for files and directories. This should make some of the Midnight Commander users happy.
Remove an extra ')' in the manual page.
PR: 35699 Submitted by: Eugene Grosbein <eugen@grosbein.pp.ru> (original version) Tested by: simon
|
118607 |
07-Aug-2003 |
jhb |
Consistently use the BSD u_int and u_short instead of the SYSV uint and ushort. In most of these files, there was a mixture of both styles and this change just makes them self-consistent.
Requested by: bde (kern_ktrace.c)
|
118520 |
06-Aug-2003 |
phk |
Don't drop giant around ->d_strategy(), too much code explodes.
|
118463 |
05-Aug-2003 |
phk |
Only drop Giant around the drivers ->d_strategy() if the buffer is not marked to prevent this.
|
118047 |
26-Jul-2003 |
phk |
Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland.
The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/*
For now pass -1 all over the place.
|
118035 |
26-Jul-2003 |
tjr |
Revise and improve ntfs_subr.c 1.30: read only a single cluster at a time in ntfs_writentvattr_plain and ntfs_readntvattr_plain, and purge the boot block from the buffer cache if it isn't exactly one cluster long. These two changes work around the same buffer cache bug that ntfs_subr.c 1.30 tried to, but in a different way. This may decrease throughput by reading smaller amounts of data from the disk at a time, but may increase it by avoiding bogus writes of clean buffers. Problem (re)reported by Karel J. Bosschaart on -current.
|
117949 |
24-Jul-2003 |
peter |
size_t != int. Make this compile on 64 bit platforms (eg: amd64). Also, "u_short value; if (value > 0xffff)" can never be true.
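To illustrate the second point: once a value has been stored in a 16-bit unsigned short it can never exceed 0xffff, so a range check has to be made on the wide value before narrowing. A minimal sketch, with a hypothetical helper name:

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 and stores the narrowed value if len fits, 0 otherwise. */
static int
narrow_to_ushort(size_t len, unsigned short *out)
{
	if (len > USHRT_MAX)	/* test the wide value, not the narrow one */
		return (0);
	*out = (unsigned short)len;
	return (1);
}
```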
|
117200 |
03-Jul-2003 |
trhodes |
If bread() returns a zero-length buffer, as can happen after a failed write, return an error instead of looping forever.
PR: 37035 Submitted by: das
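The failure mode can be sketched in userland terms: a loop that retries until all data has arrived never terminates if the reader keeps returning zero bytes. The callback type and function names below are hypothetical and only model the pattern, not the msdosfs code itself.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reader callback: returns the number of bytes produced. */
typedef size_t (*demo_read_fn)(void *ctx, size_t want);

/*
 * Read 'total' bytes in chunks. A zero-length result makes no progress,
 * so it is reported as an error instead of being retried forever.
 */
static int
read_all(demo_read_fn rd, void *ctx, size_t total)
{
	size_t got;

	while (total > 0) {
		got = rd(ctx, total);
		if (got == 0)
			return (EIO);
		total -= got < total ? got : total;
	}
	return (0);
}

/* Demo reader: hands out at most 4 bytes per call. */
static size_t
demo_read_chunk(void *ctx, size_t want)
{
	(void)ctx;
	return (want < 4 ? want : 4);
}

/* Demo reader: simulates the zero-length buffer after a failed write. */
static size_t
demo_read_empty(void *ctx, size_t want)
{
	(void)ctx;
	(void)want;
	return (0);
}
```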
|
117018 |
29-Jun-2003 |
tjr |
XXX Copy workaround from UFS: open device for write access even if the user requests a read-only mount. This is necessary because we don't do the VOP_OPEN again if they upgrade a read-only mount to read-write.
Fixes lockup when creating files on msdosfs mounts that have been mounted read-only then upgraded to read-write. The exact cause of the lockup is not known, but it is likely to be the kernel getting stuck in an infinite loop trying to write dirty buffers to a device without write permission.
Reported/tested by andreas, discussed with phk.
|
116917 |
27-Jun-2003 |
trhodes |
Fix a bug where a truncate operation involving truncate() or ftruncate() on an MSDOSFS file system either failed, silently corrupted the file, or sometimes corrupted the neighboring file.
PR: 53695 Submitted by: Ariff Abdullah <skywizard@MyBSD.org.my> (original version) MFC: 3 days
|
116796 |
24-Jun-2003 |
jmg |
change dev_t to struct cdev * to match ufs. This fixes fstat for cd9660 and msdosfs.
Reviewed by: bde
|
116678 |
22-Jun-2003 |
phk |
Add a f_vnode field to struct file.
Several of the subtypes have an associated vnode which is used for stuff like the f*() functions.
By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use.
At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.
|
116639 |
20-Jun-2003 |
jmg |
fix grammar in comment
|
116620 |
20-Jun-2003 |
tjr |
Merge from NetBSD src/sys/ntfs/ntfs_subr.c 1.5 & 1.30 (jdolecek):
- Avoid calling bread() with different sizes on the same blkno. Although the buffer cache is designed to handle differing size buffers, it erroneously tries to write the incorrectly-sized buffer back to disk before reading the correctly-sized one, even when it's not dirty. This behaviour caused a panic for read-only NTFS mounts when INVARIANTS was enabled ("bundirty: buffer x still on queue y"), reported by NAKAJI Hiroyuki.
- Fix a bug in the code handling holes: a variable was incremented instead of decremented, which could cause an infinite loop.
|
116583 |
19-Jun-2003 |
alc |
Lock the vm object when freeing a vm page.
|
116561 |
19-Jun-2003 |
alc |
Lock the vm object when freeing a vm page.
|
116560 |
19-Jun-2003 |
alc |
Lock the vm object when freeing a vm page.
|
116486 |
17-Jun-2003 |
tjr |
Send the close request to the SMB server in smbfs_inactive(), instead of smbfs_close(). This fixes paging to and from mmap()'d regions of smbfs files after the descriptor has been closed, and makes thttpd, GNU ld, and perhaps more things work that depend on being able to do this.
PR: 48291
|
116472 |
17-Jun-2003 |
tjr |
Set f_mntfromname[] to "fdescfs" instead of "fdesc" for consistency with other synthetic filesystems, which have f_mntfromname the same as f_fstypename. Noticed by Sean Kelly on -current.
|
116469 |
17-Jun-2003 |
tjr |
MFp4: Fix two bugs causing possible deadlocks or panics, and one nit:
- Emulate lock draining (LK_DRAIN) in null_lock() to avoid deadlocks when the vnode is being recycled.
- Don't allow null_nodeget() to return a nullfs vnode from the wrong mount when multiple nullfs's are mounted. It's unclear why these checks were removed in null_subr.c 1.35, but they are definitely necessary. Without the checks, trying to unmount a nullfs mount will erroneously return EBUSY, and forcibly unmounting with -f will cause a panic.
- Bump LOG2_SIZEVNODE up to 8, since vnodes are >256 bytes now. The old value (7) didn't cause any problems, but made the hash algorithm suboptimal.
These changes fix nullfs enough that a parallel buildworld succeeds.
Submitted by: tegge (partially; LK_DRAIN) Tested by: kris
|
116447 |
16-Jun-2003 |
truckman |
Partially back out rev 1.87 by nuking fifo_inactive() and moving the resource deallocation back to fifo_close(). This eliminates any stale data that might be stuck in the socket buffers after all the readers and writers have closed the fifo.
Tested by: Thorsten Schroeder <ths@katjusha.de>
|
116418 |
15-Jun-2003 |
phk |
In specfs::vop_specstrategy(), assert that the vnode and buffer agree about the device.
|
116416 |
15-Jun-2003 |
phk |
I have not had any reports of trouble for a long time, so remove the gentle versions of the vop_strategy()/vop_specstrategy() mismatch methods and use vop_panic() instead.
|
116414 |
15-Jun-2003 |
phk |
Take 2: Remove _both_ KASSERTS.
|
116413 |
15-Jun-2003 |
phk |
Duh! I misread my handwritten notes: We do _not_ want to assert that vp == bp->b_vp in specfs, that was the entire point of VOP_SPECSTRATEGY().
|
116412 |
15-Jun-2003 |
phk |
Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations to check that the buffer points to the correct vnode.
|
116410 |
15-Jun-2003 |
phk |
Remove in toto coda_strategy which incorrectly implemented vop_panic();
|
116366 |
15-Jun-2003 |
das |
Fix some style problems, some of which are old, some new, and some inherited from UFS.
Requested by: bde, njl
|
116361 |
15-Jun-2003 |
davidxu |
Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
|
116358 |
14-Jun-2003 |
das |
If someone tries to mount a union filesystem with another unionfs as the upper layer, fail gracefully instead of panicking.
MFC after: 3 days
|
116357 |
14-Jun-2003 |
das |
Introduce malloc types M_UNDCACHE and M_UNPATH for important unionfs-related data structures to aid in debugging memory leaks. Use NULL and NULLVP instead of 0 as appropriate.
MFC after: 3 days
|
116356 |
14-Jun-2003 |
das |
Factor out the process of freeing ``directory caches'', which unionfs directory vnodes use to refer to their constituent vnodes, into union_dircache_free(). Also s/union_dircache/union_dircache_get/ and tweak the structure of union_dircache_r().
MFC after: 3 days
|
116338 |
14-Jun-2003 |
tjr |
Don't follow smbnode n_parent pointer when NREFPARENT flag is not set in smb_fphelp(): the parent vnode may have already been recycled since we don't hold a reference to it. Fixes a panic when rebooting with mdconfig -t vnode devices referring to vnodes on a smbfs mount.
|
116290 |
13-Jun-2003 |
das |
Plug a serious memory leak. The -STABLE equivalent of this patch has been tested extensively, but -CURRENT testing has been hampered by a number of panics that also occur without the patch. Since the destabilizing changes between 4.X and 5.X are external to unionfs, I believe this patch applies equally well to both.
Thanks to scrappy for assistance testing these and other changes.
MFC after: 4 days
|
116281 |
13-Jun-2003 |
truckman |
Clean up the fifo_open() implementation:
Restructure the error handling portion of the resource allocation code to eliminate duplicated code.
Test for the O_NONBLOCK && fi_readers == 0 case before incrementing fi_writers and modifying the socket flag to avoid having to undo these operations in this error case.
Restructure and simplify the code that handles blocking opens.
There should be no change to functionality.
|
116271 |
12-Jun-2003 |
phk |
Initialize struct vfsops C99-sparsely.
Submitted by: hmp Reviewed by: phk
|
116181 |
11-Jun-2003 |
obrien |
Use __FBSDID().
|
116173 |
10-Jun-2003 |
obrien |
Use __FBSDID().
|
115609 |
01-Jun-2003 |
truckman |
Don't unlock the parent directory vnode twice if the ISDOTDOT flag is set.
|
115602 |
01-Jun-2003 |
truckman |
Fix up locking problems in fifo_open() and fifo_close():
Sleep on the vnode interlock while waiting for another caller to increment fi_readers or fi_writers. Hold the vnode interlock while incrementing fi_readers or fi_writers to prevent a wakeup from being missed.
Only access fi_readers and fi_writers while holding the vnode lock. Previously fifo_close() decremented their values without holding a lock.
Move resource deallocation from fifo_close() to fifo_inactive(), which allows the VOP_CLOSE() call in the error return path in fifo_open() to be removed. Fifo_open() was calling VOP_CLOSE() with the vnode lock held, in violation of the current vnode locking API. Also the way fifo_close() used vrefcnt() to decide whether to deallocate resources was bogus according to comments in the vrefcnt() implementation.
Reviewed by: bde
|
115549 |
31-May-2003 |
phk |
Remove unused variable(s).
Found by: FlexeLint
|
115542 |
31-May-2003 |
phk |
Remove unused variable(s).
Found by: FlexeLint
|
115511 |
31-May-2003 |
phk |
Remove unused variable.
Found by: FlexeLint
|
115486 |
31-May-2003 |
phk |
Use temporary variable to avoid double expansion of macro with side effects.
Found by: FlexeLint
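For readers unfamiliar with this class of bug, a generic sketch follows. The macro and function names are hypothetical and unrelated to the file touched by this commit: a function-like macro that mentions its argument twice also evaluates any side effect twice, and a temporary variable avoids that.

```c
/* Classic function-like macro: each argument may be evaluated twice. */
#define	MAX_OF(a, b)	((a) > (b) ? (a) : (b))

static int calls;	/* counts how often next_value() runs */

static int
next_value(void)
{
	return (++calls);
}

/* Unsafe: next_value() is expanded twice inside the macro. */
static int
max_unsafe(int floor)
{
	return (MAX_OF(next_value(), floor));
}

/* Safe: evaluate once into a temporary, then pass a plain variable. */
static int
max_safe(int floor)
{
	int v = next_value();

	return (MAX_OF(v, floor));
}
```

Starting from calls == 0, max_unsafe(0) runs next_value() twice and returns the second result, while max_safe(0) runs it exactly once.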
|
115485 |
31-May-2003 |
phk |
Remove unused variable.
Found by: FlexeLint
|
114734 |
05-May-2003 |
rwatson |
Clean up proc locking in procfs: make sure the proc lock is held before entering sys_process.c debugging primitives, or we violate assertions. Also, be more careful about releasing the process lock around calls to uiomove() which may sleep waiting for paging machinations or related notions. We may want to defer the uiomove() in at least one case, but jhb will look into that at a later date.
Reported by: Philippe Charnier <charnier@xp11.frmug.org> Reviewed by: jhb
|
114653 |
04-May-2003 |
scottl |
Eliminate the separate malloc type for the sparing table.
|
114652 |
04-May-2003 |
scottl |
Add a missing __inline. Strange that gcc never complained about it. Implement udf_readlblks() in terms of RDSECTOR.
|
114651 |
04-May-2003 |
scottl |
Correctly calculate the size of the extent that should be read in udf_readatoffset(). This should fix problems with reading udf filesystems created with mkisofs.
|
114632 |
04-May-2003 |
scottl |
Implement the node cache as a hash table.
|
114434 |
01-May-2003 |
des |
Instead of recording the Unix time in a process when it starts, record the uptime. Where necessary, convert it back to Unix time by adding boottime to it. This fixes a potential problem in the accounting code, which would compute the elapsed time incorrectly if the Unix time was stepped during the lifetime of the process.
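The arithmetic can be sketched as follows, in whole seconds and with hypothetical structure and function names. Elapsed time is a difference of two uptimes, so stepping the wall clock does not affect it; Unix time is only reconstructed on demand.

```c
/* All times in whole seconds; names are hypothetical. */
struct demo_proc {
	long long start_uptime;	/* uptime when the process started */
};

/* Elapsed run time: a difference of two uptimes, immune to clock steps. */
static long long
proc_elapsed(const struct demo_proc *p, long long now_uptime)
{
	return (now_uptime - p->start_uptime);
}

/* Start time as Unix time, reconstructed on demand from boottime. */
static long long
proc_start_unixtime(const struct demo_proc *p, long long boottime)
{
	return (boottime + p->start_uptime);
}
```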
|
114216 |
29-Apr-2003 |
kan |
Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h>
Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
113979 |
24-Apr-2003 |
jhb |
Fail to mount a device if the bytes per sector in the BPB is less than DEV_BSIZE or if the number of FAT sectors is zero.
|
113867 |
22-Apr-2003 |
jhb |
- Always call faultin() in _PHOLD() if PS_INMEM is clear. This closes a race where a thread could assume that a process was swapped in by PHOLD() when it actually wasn't fully swapped in yet.
- In faultin(), always msleep() if PS_SWAPPINGIN is set instead of doing this check after bumping p_lock in the PS_INMEM == 0 case. Also, sched_lock is only needed for setting and clearing swapping PS_* flags and the swap thread inhibitor.
- Don't set and clear the thread swap inhibitor in the same loops as the pmap_swapin/out_thread() since we have to do it under sched_lock. Instead, mimic the treatment of the PS_INMEM flag and use separate loops to set the inhibitors when clearing PS_INMEM and clear the inhibitors when setting PS_INMEM.
- swapout() now returns with the proc lock held as it holds the lock while adjusting the swapping-related PS_* flags so that the proc lock can be used to test those flags.
- Only use the proc lock to check the swapping-related PS_* flags in several places.
- faultin() no longer requires sched_lock to be held by callers.
- Rename PS_SWAPPING to PS_SWAPPINGOUT to be less ambiguous now that we have PS_SWAPPINGIN.
|
113620 |
17-Apr-2003 |
jhb |
- Use a local variable to close a minor race when determining if the wmesg printed out needs a prefix such as when a thread is blocked on a lock.
- Use another local variable to close another race for the td_wmesg and td_wchan members of struct thread.
|
113619 |
17-Apr-2003 |
jhb |
Protect p_flag with the proc lock. The sched_lock is not needed to turn off P_STOPPED_SIG in p_flag.
|
113618 |
17-Apr-2003 |
jhb |
- P_SHOULDSTOP just needs proc lock now, so don't acquire sched_lock unless it is needed.
- Add a proc lock assertion.
|
113617 |
17-Apr-2003 |
jhb |
Add a proc lock assertion and move another assertion up to the top of the function.
|
113310 |
10-Apr-2003 |
imp |
It appears that msdosfs_init() is called multiple times. This happens on my system where I preload msdosfs and have it in my kernel. There's likely another bug that's causing msdosfs_init() to be called multiple times, but this makes that harmless.
|
112934 |
01-Apr-2003 |
jeff |
- smb_td_intr takes a thread as an argument not a proc.
|
112933 |
01-Apr-2003 |
jeff |
- smb_proc_intr is now spelled smb_td_intr.
Noticed by: phk Pointy hat to: jeffr
|
112916 |
01-Apr-2003 |
tjr |
Specify the M_WAITOK flag explicitly in the MALLOC call to silence a runtime warning ("Bad malloc flags: 0").
|
112915 |
01-Apr-2003 |
tjr |
Give the M_WAITOK flag explicitly to the MALLOC call to silence a runtime warning ("Bad malloc flags: 0").
|
112888 |
31-Mar-2003 |
jeff |
- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with a follow on commit to kern_sig.c
- signotify() now operates on a thread since unmasked pending signals are stored in the thread.
- PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
|
112706 |
27-Mar-2003 |
tjr |
Deregister the dev_clone event handler we registered - don't touch the handlers installed by other devices.
|
112564 |
24-Mar-2003 |
jhb |
Replace the at_fork, at_exec, and at_exit functions with the slightly more flexible process_fork, process_exec, and process_exit eventhandlers. This reduces code duplication and also means that I don't have to go duplicate the eventhandler locking three more times for each of at_fork, at_exec, and at_exit.
Reviewed by: phk, jake, almost complete silence on arch@
|
112529 |
24-Mar-2003 |
bde |
Better fix for the problem addressed by rev.1.79: don't loop in fifo_open() waiting for another reader or writer if one arrived and departed while we were waiting (or a little earlier).
Rev.1.79 broke blocking opens of fifos by making them time out after 1 second. This was bad for at least apsfilter.
Tested by: "Simon 'corecode' Schubert" <corecode@corecode.ath.cx>, Alexander Leidinger <Alexander@leidinger.net>, phk MFC after: 4 weeks
|
112317 |
16-Mar-2003 |
tjr |
Make udf_allocv() return an unlocked vnode instead of a locked one to avoid a "locking against myself" panic when udf_hashins() tries to lock it again. Lock the vnode in udf_hashins() before adding it to the hash bucket.
|
112183 |
13-Mar-2003 |
jeff |
- Add a lock for protecting against msleep(bp, ...) wakeup(bp) races.
- Create a new function bdone() which sets B_DONE and calls wakeup(bp). This is suitable for use as b_iodone for buf consumers who are not going through the buf cache.
- Create a new function bwait() which waits for the buf to be done at a set priority and with a specific wmesg.
- Replace several cases where the above functionality was implemented without locking with the new functions.
|
112119 |
11-Mar-2003 |
kan |
Rename vfs_stdsync function to vfs_stdnosync which matches more closely what the function is really doing. Update all existing consumers to use the new name.
Introduce a new vfs_stdsync function, which iterates over the mount point's vnodes and calls FSYNC on each one of them in turn.
Make nwfs and smbfs use this new function instead of rolling their own identical sync implementations.
Reviewed by: jeff
|
111960 |
07-Mar-2003 |
tjr |
Set f_fstypename in coda_nb_statfs().
|
111945 |
06-Mar-2003 |
tjr |
Add a temporary workaround for a deadlock in Coda venus 5.3.19 that occurs when mounting the filesystem. The problem is that venus issues the mount() syscall, which calls vfs_mount(), which calls coda_root() which attempts to communicate with venus.
|
111944 |
06-Mar-2003 |
tjr |
Remove fragments of support for the FreeBSD 3.x and 4.x branches.
|
111931 |
05-Mar-2003 |
tjr |
VOP_PATHCONF returns a register_t, not an int. Noticed by phk.
|
111908 |
05-Mar-2003 |
tjr |
Add prototype for coda_pathconf() that I missed in the previous commit.
|
111903 |
05-Mar-2003 |
tjr |
Add a minimal implementation of VOP_PATHCONF to silence warning messages from ls(1).
|
111902 |
05-Mar-2003 |
tjr |
Handle the case where a_uio->uio_td == NULL properly in coda_readlink(). This happens when called from lookup().
|
111856 |
04-Mar-2003 |
jeff |
- Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT flag to the initial BUF_LOCK(). This will eventually be used in cases where we want to use a buffer only if it is not currently in use.
- Convert all consumers of the getblk() api to use this extra parameter.
Reviewed by: arch Not objected to by: mckusick
|
111841 |
03-Mar-2003 |
njl |
Finish cleanup of vprint() which was begun with changing v_tag to a string. Remove extraneous uses of vop_null, instead deferring to the default op. Rename vnode type "vfs" to the more descriptive "syncer". Fix formatting for various filesystems that use vop_print.
|
111821 |
03-Mar-2003 |
phk |
Make nokqfilter() return the correct return value.
Ditch the D_KQFILTER flag which was used to prevent calling NULL pointers.
|
111815 |
03-Mar-2003 |
phk |
Gigacommit to improve device-driver source compatibility between branches:
Initialize struct cdevsw using C99 sparse initialization and remove all initializations to default values.
This patch is automatically generated and has been tested by compiling LINT with all the fields in struct cdevsw in reverse order on alpha, sparc64 and i386.
Approved by: re(scottl)
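For reference, a sketch of what C99 sparse (designated) initialization looks like. The structure below is a hypothetical stand-in and not the real struct cdevsw; it only shows why member order and defaulted members stop mattering to driver source.

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver switch table; not the real struct cdevsw. */
struct demo_devsw {
	int	(*d_open)(int unit);
	int	(*d_close)(int unit);
	int	(*d_read)(int unit, char *buf, size_t len);
	const char *d_name;
};

static int
demo_open(int unit)
{
	return (unit >= 0 ? 0 : -1);
}

/*
 * Sparse (designated) initialization: only the members that differ from
 * the default are named; every other member is zero/NULL, and the order
 * of members in the struct no longer matters.
 */
static struct demo_devsw demo_sw = {
	.d_name = "demo",
	.d_open = demo_open,
};
```

Because members are named, reordering the fields of the structure (as the LINT test in this commit did for the real one) leaves such an initializer valid.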
|
111769 |
02-Mar-2003 |
des |
Get rid of caddr_t.
|
111748 |
02-Mar-2003 |
des |
More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9).
|
111742 |
02-Mar-2003 |
des |
Clean up whitespace, s/register //, refrain from strong urge to ANSIfy.
|
111741 |
02-Mar-2003 |
des |
uiomove-related caddr_t -> void * (just the low-hanging fruit)
|
111738 |
02-Mar-2003 |
des |
wakeup(9) and msleep(9) take void * arguments, not caddr_t.
|
111730 |
02-Mar-2003 |
phk |
NODEVFS cleanup:
Replace devfs_{create,destroy} hooks with direct function calls.
|
111611 |
27-Feb-2003 |
tjr |
Copy some VM changes from smbfs_putpages() to nwfs_putpages(): lock page queues, use vm_page_undirty().
|
111603 |
27-Feb-2003 |
tjr |
Fix vnode corruption bug when trying to rename files across filesystems. Similar to the bug fixed in smbfs_vnops.c rev 1.33.
|
111601 |
27-Feb-2003 |
tjr |
Sync nwfs_access() with smbfs_access(): use vaccess() instead of checking permissions ourself, fixes problem with VAPPEND.
|
111597 |
27-Feb-2003 |
tjr |
Catch up with recent netncp changes: ncp_chkintr() takes a thread, not a proc, as its second argument.
|
111585 |
27-Feb-2003 |
julian |
Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.
|
111573 |
26-Feb-2003 |
phk |
msg
|
111127 |
19-Feb-2003 |
tjr |
Do not call smbfs_attr_cacheremove() in the EXDEV case in smbfs_rename(). One of the vnodes is on a different mount and is possibly on a different kind of filesystem; treating it as an smbfs vnode then writing to it will probably corrupt it.
PR: 48381 MFC after: 1 month
|
111119 |
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
110700 |
11-Feb-2003 |
phk |
Use the SI_CANDELETE flag on the dev_t rather than the D_CANFREE flag on the cdevsw to determine ability to handle the BIO_DELETE request.
|
110584 |
09-Feb-2003 |
jeff |
- Cleanup unlocked accesses to buf flags by introducing a new b_vflag member that is protected by the vnode lock.
- Move B_SCANNED into b_vflags and call it BV_SCANNED.
- Create a vop_stdfsync() modeled after spec's sync.
- Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some fs specific processing. This gives all of these filesystems proper behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
- Annotate the locking in buf.h
|
110533 |
08-Feb-2003 |
tjr |
Revert removal of vnode and VFS stubs; bp asserts that they are needed.
|
110501 |
07-Feb-2003 |
tjr |
Garbage-collect stub VFS ops, use the defaults instead.
|
110500 |
07-Feb-2003 |
tjr |
Garbage-collect stub vnode ops, use the defaults instead.
|
110314 |
04-Feb-2003 |
tjr |
Add missing permission checks to the smbfs VOP_SETATTR vnode op for the case where the caller requests to change access or modification times.
MFC after: 3 days
|
110299 |
03-Feb-2003 |
phk |
Split the global timezone structure into two integer fields to prevent the compiler from optimizing assignments into byte-copy operations which might make access to the individual fields non-atomic.
Use the individual fields throughout, and don't bother locking them with Giant: it is no longer needed.
Inspired by: tjr
|
110272 |
03-Feb-2003 |
tjr |
Use vaccess() instead of rolling our own access checks. This fixes a bug where requests to open a file in append mode were always denied, and will also be useful when capabilities and auditing are implemented.
|
110063 |
29-Jan-2003 |
phk |
NODEVFS cleanup: remove #ifdefs.
|
110043 |
29-Jan-2003 |
tjr |
Escape the backslash in badchars so that smbfs_pathcheck() correctly rejects pathnames with backslashes in them (and to avoid a syntax error).
Found by: FlexeLint
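A sketch of the escaping issue, using a hypothetical character set and function name rather than the actual contents of smbfs_pathcheck(): in a C string literal a lone backslash begins an escape sequence, so a literal backslash has to be written as two.

```c
#include <string.h>

/*
 * Hypothetical set of rejected characters. In a C string literal a lone
 * backslash starts an escape sequence, so a literal backslash is "\\".
 */
static const char demo_badchars[] = "*/:<>?\\";

/* Returns 1 if name contains none of the rejected characters. */
static int
demo_pathcheck(const char *name)
{
	return (name[strcspn(name, demo_badchars)] == '\0');
}
```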
|
109969 |
28-Jan-2003 |
tjr |
Do not allow a cached vnode to be shared among multiple mounts of the same kind of pseudofs-based filesystem. Fixes (at least) one problem where, when procfs is mounted multiple times, trying to unmount one will often cause the wrong one to get unmounted, and another problem where mounting one procfs on top of another caused the kernel to lock up.
Reviewed by: des
|
109623 |
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
109608 |
21-Jan-2003 |
rwatson |
GC an unused reference to vop_refreshlabel_desc; reference to opt_mac.h was removed previously so it was never compiled in.
Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
109526 |
19-Jan-2003 |
phk |
Originally when DEVFS was added, a global variable "devfs_present" was used to control code which was conditional on DEVFS' presence since this avoided the need for large-scale source pollution with #include "opt_geom.h"
Now that we approach making DEVFS standard, replace these tests with an #ifdef to facilitate mechanical removal once DEVFS becomes non-optional.
No functional change by this commit.
|
109450 |
18-Jan-2003 |
tjr |
Fake up a struct componentname to pass to VOP_WHITEOUT instead of passing NULL. union_whiteout() expects the componentname argument to be non-NULL. Fixes a NULL dereference panic when an existing union mount becomes the upper layer of a new union mount.
|
109202 |
13-Jan-2003 |
phk |
Even if the permissions deny it, a process should be allowed to access its controlling terminal.
In essence, history dictates that any process is allowed to open /dev/tty for RW, irrespective of credential, because by definition it is its own controlling terminal.
Before DEVFS we relied on a hacky half-device thing (kern/tty_tty.c) which did the magic deep down at device level, which at best was disgusting from an architectural point of view.
My first shot at this was to use the cloning mechanism to simply give people the right tty when they ask for /dev/tty, that's why you get this, slightly counter intuitive result:
syv# ls -l /dev/tty `tty`
crw--w---- 1 u1 tty 5, 0 Jan 13 22:14 /dev/tty
crw--w---- 1 u1 tty 5, 0 Jan 13 22:14 /dev/ttyp0
Trouble is, when user u1 su(1)'s to user u2, he cannot open /dev/ttyp0 anymore because he doesn't have permission to do so.
The above fix allows him to do that.
The interesting side effect is that one was previously only able to access the controlling tty by indirection: date > /dev/tty but not by name: date > `tty`
This is now possible, and that feels a lot more like DTRT.
PR: 46635 MFC candidate: could be.
|
109153 |
13-Jan-2003 |
dillon |
Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.
|
109123 |
12-Jan-2003 |
dillon |
Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it.
Change struct xfile xf_data to xun_data (ABI is still compatible).
If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.
|
109090 |
11-Jan-2003 |
dd |
Add symlink support to devfs_rule_matchpath(). This allows the user to unhide symlinks as well as hide them.
|
108716 |
05-Jan-2003 |
phk |
Don't override the vop_lock, vop_unlock and vop_isunlocked methods.
Previously all filesystems which relied on specfs to do devices would have private overrides for vop_std*, so the vop_no* overrides here had no effect. I overlooked the transitive nature of the vop vectors when I removed the vop_std* in those filesystems.
Removing the override here restores device node locking to its previous modus operandi.
Spotted by: bde
|
108707 |
05-Jan-2003 |
phk |
Don't take the detour over VOP_STRATEGY from spec_getpages, call our own strategy directly.
|
108706 |
05-Jan-2003 |
phk |
Split out the vnode and buf arguments to the internal strategy worker routine instead of doing evil casts.
|
108692 |
05-Jan-2003 |
tjr |
Repair vnode locking in portal_lookup(). Specifically, lock the file vnode, and unlock the parent directory vnode if LOCKPARENT is not set.
Obtained from: NetBSD (rev. 1.34)
|
108686 |
04-Jan-2003 |
phk |
Temporarily introduce a new VOP_SPECSTRATEGY operation while I try to sort out disk-io from file-io in the vm/buffer/filesystem space.
The intent is to sort VOP_STRATEGY calls into those which operate on "real" vnodes and those which operate on VCHR vnodes. For the latter kind, the call will be changed to VOP_SPECSTRATEGY, possibly conditionally for those places where dual-use happens.
Add a default VOP_SPECSTRATEGY method which will call the normal VOP_STRATEGY. First time it is called it will print debugging information. This will only happen if a normal vnode is passed to VOP_SPECSTRATEGY by mistake.
Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY does on a VCHR vnode today.
Add a new VOP_STRATEGY method in specfs to catch instances where the conversion to VOP_SPECSTRATEGY has not yet happened. Handle the request just like we always did, but first time called print debugging information.
Apart from up to two instances of console messages per boot, this amounts to a glorified no-op commit.
If you get any of the messages on your console I would very much like a copy of them mailed to phk@freebsd.org
|
108681 |
04-Jan-2003 |
phk |
resort vnode ops list
|
108658 |
04-Jan-2003 |
phk |
Replace spec_bmap() with vop_panic: We should never BMAP a device backed vnode, only filesystem backed vnodes.
|
108648 |
04-Jan-2003 |
phk |
Since Jeffr made the std* functions the default in rev 1.63 of kern/vfs_defaults.c it is wrong for the individual filesystems to use the std* functions as that prevents override of the default.
Found by: src/tools/tools/vop_table
|
108589 |
03-Jan-2003 |
phk |
Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
|
108586 |
03-Jan-2003 |
phk |
Remove unused second argument from DEV_STRATEGY().
|
108470 |
30-Dec-2002 |
schweikh |
Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/ Add FreeBSD Id tag where missing.
|
108387 |
29-Dec-2002 |
phk |
There is some sort of race/deadlock which I have not identified here. It manifests itself by sendmail hanging in "fifoow" during boot on a diskless machine with sendmail disabled.
Giving the sleep a 1sec timeout breaks the deadlock, but does not solve the underlying problem.
XXX comment applied.
|
108357 |
28-Dec-2002 |
dillon |
Abstract-out the constants for the sequential heuristic.
No operational changes.
MFC after: 1 day
|
108341 |
28-Dec-2002 |
rwatson |
Trim left-over and unused vop_refreshlabel() bits from devfs.
Reported by: bde
|
107890 |
15-Dec-2002 |
tjr |
Remove redundant check for negative or zero v_usecount; vrele() already checks that.
|
107842 |
13-Dec-2002 |
tjr |
Keep trying to flush the vnode list for the mount while some are still busy and we are making progress towards making them not busy. This is needed because smbfs vnodes reference their parent directory but may appear after their parent in the mount's vnode list; one pass over the list is not sufficient in this case.
This stops attempts to unmount idle smbfs mounts failing with EBUSY.
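The retry-while-making-progress idea described above can be sketched in user space as follows. The struct node layout and flush_all() are hypothetical illustrations, not the smbfs or VFS code; the sketch only assumes that a child holds one reference on its parent and may appear after it in the list.

```c
#include <stddef.h>

/* Hypothetical node: parent is an index into the same array, or -1. */
struct node {
	int parent;	/* index of the parent directory, -1 for none */
	int refs;	/* references currently held on this node */
	int freed;	/* nonzero once the node has been released */
};

/*
 * Make repeated passes over the list, releasing every node that has no
 * references left. Keep going while some nodes are still busy and the
 * last pass released at least one node. Returns the number of nodes
 * that are still busy (0 means everything was flushed).
 */
static int
flush_all(struct node *nodes, size_t n)
{
	size_t i;
	int busy, progress;

	do {
		busy = 0;
		progress = 0;
		for (i = 0; i < n; i++) {
			if (nodes[i].freed)
				continue;
			if (nodes[i].refs > 0) {
				busy++;
				continue;
			}
			nodes[i].freed = 1;
			if (nodes[i].parent >= 0)
				nodes[nodes[i].parent].refs--;
			progress = 1;
		}
	} while (busy > 0 && progress);
	return (busy);
}
```

A single pass would report the parent as busy when the child follows it in the list; the second pass releases it. A node referenced from outside the list stops the loop because no progress is made.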
|
107822 |
13-Dec-2002 |
tjr |
Fix build with SMB_VNODE_DEBUG defined; use td_proc->p_pid instead of the nonexistent td_pid.
|
107821 |
13-Dec-2002 |
tjr |
Store a reference to the parent directory's vnode in struct smbnode, not to the parent's smbnode, which may be freed during the lifetime of the child if the mount is forcibly unmounted. umount -f should now work properly (ie. not panic) on smbfs mounts.
|
107698 |
09-Dec-2002 |
rwatson |
Remove dm_root entry from struct devfs_mount. It's never set, and is unused. Replace it with a dm_mount back-pointer to the struct mount that the devfs_mount is associated with. Export that pointer to MAC Framework entry points, where all current policies don't use the pointer. This permits the SEBSD port of SELinux's FLASK/TE to compile out-of-the-box on 5.0-CURRENT with full file system labeling support.
Approved by: re (murray) Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
106696 |
09-Nov-2002 |
alfred |
Fix instances of macros with improperly parenthesized arguments.
Verified by: md5
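For illustration only, a generic example of why macro arguments need parentheses. The macro names DOUBLE_BAD and DOUBLE_OK are hypothetical and are not taken from the files touched by this commit.

```c
/* Unsafe: the argument is substituted without parentheses. */
#define	DOUBLE_BAD(x)	(x * 2)

/* Safe: the argument is parenthesized before use. */
#define	DOUBLE_OK(x)	((x) * 2)
```

DOUBLE_BAD(1 + 2) expands to (1 + 2 * 2), which is 5, while DOUBLE_OK(1 + 2) expands to ((1 + 2) * 2), which is 6. When every existing caller already passes a simple expression, adding the parentheses changes no generated code.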
|
106595 |
07-Nov-2002 |
jhb |
Cast a pointer to a uintptr_t to quiet a warning.
|
106594 |
07-Nov-2002 |
jhb |
Third argument to copyinstr() is a pointer to a size_t, not a pointer to a u_int.
|
106402 |
04-Nov-2002 |
mckusick |
Add debug.doslowdown to enable/disable niced slowdown on I/O. Default to off until locking interference issues get sorted out.
Sponsored by: DARPA & NAI Labs.
|
106355 |
02-Nov-2002 |
peter |
Unbreak MNT_UPDATE when running with cd as root. Detect mountroot by checking for "path == NULL" (like ffs) rather than MNT_ROOT. Otherwise when you try and do an update or mountd does an NFS export, the remount fails because the code tries to mount a fresh rootfs and gets an EBUSY. The same bug is in 4.x (which is where I found it).
Sanity check by: mux
|
106298 |
01-Nov-2002 |
phk |
Put a KASSERT in specfs::strategy() to check that the incoming buffer has a valid b_iocmd. Valid is any one of BIO_{READ,WRITE,DELETE}.
I have seen at least one case where the bio_cmd field was zero once the request made it into GEOM. Putting the KASSERT here allows us to spot the culprit in the backtrace.
|
106110 |
29-Oct-2002 |
semenu |
Fix winChkName() to match when the last slot contains nothing but the terminating zero (it was treated as a length mismatch). mtools creates such slots if the name length is a multiple of 13 (max number of unicode chars fitting in a directory slot).
MFC after: 1 week
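A small arithmetic sketch of the slot counts involved. The constant name SLOT_CHARS and both helper functions are hypothetical; only the figure of 13 characters per slot and the extra terminator-only slot come from the commit message.

```c
#include <stddef.h>

#define	SLOT_CHARS	13	/* unicode chars per long-name slot */

/* Slots needed when the terminating zero is not stored. */
static size_t
slots_without_terminator(size_t len)
{
	return ((len + SLOT_CHARS - 1) / SLOT_CHARS);
}

/* Slots needed when the terminating zero is always stored. */
static size_t
slots_with_terminator(size_t len)
{
	return ((len + 1 + SLOT_CHARS - 1) / SLOT_CHARS);
}
```

The two counts differ only when the length is a multiple of 13: a 13-character name needs one slot without the terminator and two with it, the second holding nothing but the terminating zero.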
|
105998 |
26-Oct-2002 |
mux |
In VOP_LOOKUP, don't deny DELETE and RENAME operations when ISLASTCN is not set. The actual file which is being looked up may live in a different filesystem.
|
105988 |
26-Oct-2002 |
rwatson |
Slightly change the semantics of vnode labels for MAC: rather than "refreshing" the label on the vnode before use, just get the label right from inception. For single-label file systems, set the label in the generic VFS getnewvnode() code; for multi-label file systems, leave the labeling up to the file system. With UFS1/2, this means reading the extended attribute during vfs_vget() as the inode is pulled off disk, rather than hitting the extended attributes frequently during operations later, improving performance. This also corrects semantics for shared vnode locks, which were not previously present in the system. This changes the cache coherency properties WRT out-of-band access to label data, but in an acceptable form. With UFS1, there is a small race condition during automatic extended attribute start -- this is not present with UFS2, and occurs because EAs aren't available at vnode inception. We'll introduce a work around for this shortly.
Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
105902 |
25-Oct-2002 |
mckusick |
Within ufs, the ffs_sync and ffs_fsync functions did not always check for and/or report I/O errors. The result is that a VFS_SYNC or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in the presence of a hard error writing a disk sector or in a filesystem full condition. This patch ensures that I/O errors will always be checked and returned. This patch also ensures that every call to VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes appropriate action when an error is returned.
Sponsored by: DARPA & NAI Labs.
|
105667 |
22-Oct-2002 |
mckusick |
This checkin reimplements the io-request priority hack in a way that works in the new threaded kernel. It was commented out of the disksort routine earlier this year for the reasons given in kern/subr_disklabel.c (which is where this code used to reside before it moved to kern/subr_disk.c):
----------------------------
revision 1.65
date: 2002/04/22 06:53:20; author: phk; state: Exp; lines: +5 -0
Comment out Kirks io-request priority hack until we can do this in a civilized way which doesn't cause grief.
The problem is that it is not generally safe to cast a "struct bio *" to a "struct buf *". Things like ccd, vinum, ata-raid and GEOM constructs bio's which are not entrails of a struct buf.
Also, curthread may or may not have anything to do with the I/O request at hand.
The correct solution can either be to tag struct bio's with a priority derived from the requesting threads nice and have disksort act on this field, this wouldn't address the "silly-seek syndrome" where two equal processes bang the diskheads from one edge to the other of the disk repeatedly.
Alternatively, and probably better: a sleep should be introduced either at the time the I/O is requested or at the time it is completed where we can be sure to sleep in the right thread.
The sleep also needs to be in constant timeunits, 1/hz can be practically any sub-second size, at high HZ the current code practically doesn't do anything.
----------------------------
As suggested in this comment, it is no longer located in the disk sort routine, but rather now resides in spec_strategy where the disk operations are being queued by the thread that is associated with the process that is really requesting the I/O. At that point, the disk queues are not visible, so the I/O for positively niced processes is always slowed down whether or not there is other activity on the disk.
On the issue of scaling HZ, I believe that the current scheme is better than using a fixed quantum of time. As machines and I/O subsystems get faster, the resolution on the clock also rises. So, ten years from now we will be slowing things down for shorter periods of time, but the proportional effect on the system will be about the same as it is today. So, I view this as a feature rather than a drawback. Hence this patch sticks with using HZ.
Sponsored by: DARPA & NAI Labs. Reviewed by: Poul-Henning Kamp <phk@critter.freebsd.dk>
|
105655 |
21-Oct-2002 |
jhb |
Grrr, s/PBP/BPB/ here as well.
Noticed by: peter
|
105645 |
21-Oct-2002 |
jhb |
Spell the BPB member of the 7.10 bootsector as bsBPB rather than bsPBP to be like all the other bootsectors. Apple has done the same it seems.
|
105585 |
20-Oct-2002 |
rwatson |
Missed a case of _POSIX_MAC_PRESENT -> _PC_MAC_PRESENT rename.
Pointed out by: phk
|
105561 |
20-Oct-2002 |
phk |
'&' not used for pointers to functions.
Spotted by: FlexeLint
|
105560 |
20-Oct-2002 |
phk |
Remove even more '&' from pointers to functions.
Spotted by: FlexeLint
|
105488 |
19-Oct-2002 |
kan |
umap_sync is empty and is identical to vfs_stdsync. Remove it and use generic function instead.
Approved by: obrien
|
105487 |
19-Oct-2002 |
kan |
style(9)
Approved by: obrien
|
105212 |
16-Oct-2002 |
phk |
Fix comments and one resulting code confusion about the type of the "command" argument to VOP_IOCTL.
Spotted by: FlexeLint.
|
105211 |
16-Oct-2002 |
phk |
Be consistent about functions being static.
Spotted by: FlexeLint
|
105210 |
16-Oct-2002 |
phk |
A better solution to avoiding variable sized structs in DEVFS.
|
105209 |
16-Oct-2002 |
phk |
#include "opt_devfs.h" to protect against variable sized structures.
Spotted by: FlexeLint
|
105165 |
15-Oct-2002 |
phk |
Plug an infrequent (I think) memory leak.
Spotted by: FlexeLint
|
105077 |
14-Oct-2002 |
mckusick |
Regularize the vop_stdlock'ing protocol across all the filesystems that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock).
In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme.
Sponsored by: DARPA & NAI Labs.
|
105051 |
13-Oct-2002 |
mux |
- Remove a useless initialization for 'ronly', if it hadn't been there, we would have noticed that 'ronly' was uninitialized :-).
- Kill a nearby 'register' keyword.
|
105050 |
13-Oct-2002 |
phk |
Pass flags to VOP_CLOSE() corresponding to what was passed to VOP_OPEN().
Submitted by: "Peter Edwards" <pmedwards@eircom.net>
|
104908 |
11-Oct-2002 |
mike |
Change iov_base's type from `char *' to the standard `void *'. All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.
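A minimal sketch of the kind of cast this change requires. The helper iov_advance() is hypothetical; struct iovec and its iov_base and iov_len members are the standard ones from <sys/uio.h>. Standard C does not define arithmetic on `void *`, so the pointer is cast to `char *` before the offset is added.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Skip the first n bytes of an iovec. The caller is assumed to
 * guarantee n <= iov->iov_len.
 */
static void
iov_advance(struct iovec *iov, size_t n)
{
	iov->iov_base = (char *)iov->iov_base + n;
	iov->iov_len -= n;
}
```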
|
104653 |
08-Oct-2002 |
dd |
Treat the pathptrn field as a real pattern with the aid of fnmatch().
|
104566 |
06-Oct-2002 |
mux |
Yet another 64 bits warning fix: s/u_int/size_t/.
|
104565 |
06-Oct-2002 |
mux |
Fix a warning on 64 bits platforms: copyinstr() takes a size_t *, not an u_int *.
|
104564 |
06-Oct-2002 |
mux |
Fix a warning on 64 bits platforms: copystr() takes a size_t *, not an int *.
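A user-space sketch of why the type of the length argument matters. copystr_sketch() is a hypothetical stand-in written for this example, not the kernel routine; it only mirrors the shape of an interface that reports the copied length through a `size_t *`. On an LP64 platform `size_t` is 64 bits wide while `int` and `u_int` are 32 bits, so passing the address of an `int` would let the callee write past the variable.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy a NUL-terminated string of at most len bytes (including the NUL).
 * On return *done holds the number of bytes copied. This sketch assumes
 * the count includes the terminating NUL.
 */
static int
copystr_sketch(const char *src, char *dst, size_t len, size_t *done)
{
	size_t i;

	for (i = 0; i < len; i++) {
		dst[i] = src[i];
		if (src[i] == '\0') {
			if (done != NULL)
				*done = i + 1;
			return (0);
		}
	}
	if (done != NULL)
		*done = len;
	return (ENAMETOOLONG);
}
```

Callers declare the result variable as `size_t`, which matches the pointer type on both 32-bit and 64-bit platforms.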
|
104533 |
05-Oct-2002 |
rwatson |
Integrate a devfs/MAC fix from the MAC tree: avoid a race condition during devfs VOP symlink creation by introducing a new entry point to determine the label of the devfs_dirent prior to allocation of a vnode for the symlink.
Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
104508 |
05-Oct-2002 |
phk |
Plug memory leaks detected by FlexeLint.
|
104306 |
01-Oct-2002 |
jmallett |
Back out kernel support for reliable signal queues.
Requested by: rwatson, phk, and many others
|
104278 |
01-Oct-2002 |
phk |
Move the vop-vector declaration into devfs_vnops.c where it belongs.
|
104264 |
01-Oct-2002 |
jmallett |
When working with sigset_t's, and needing to perform masking operations based on a process's pending signals, use the signal queue flattener, ksiginfo_to_sigset_t, on the process, and on a local sigset_t, and then work with that as needed.
|
104233 |
30-Sep-2002 |
jmallett |
First half of implementation of ksiginfo, signal queues, and such. This gets signals operating based on a TailQ, and is good enough to run X11, GNOME, and do job control. There are some intricate parts which could be more refined to match the sigset_t versions, but those require further evaluation of directions in which our signal system can expand and contract to fit our needs.
After this has been in the tree for a while, I will make in kernel API changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo more robustly, such that we can actually pass information with our (queued) signals to the userland. That will also result in using a struct ksiginfo pointer, rather than a signal number, in a lot of kern_sig.c, to refer to an individual pending signal queue member, but right now there is no defined behaviour for such.
CODAFS is unfinished in this regard because the logic is unclear in some places.
Sponsored by: New Gold Technology Reviewed by: bde, tjr, jake [an older version, logic similar]
|
104113 |
28-Sep-2002 |
phk |
s/struct dev_t */dev_t */
|
104099 |
28-Sep-2002 |
phk |
Fix mis-indent.
|
104094 |
28-Sep-2002 |
phk |
Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too.
Inspired by: FlexeLint warning #512
|
104089 |
28-Sep-2002 |
phk |
I misplaced a local variable yesterday.
|
104048 |
27-Sep-2002 |
phk |
Add a D_NOGIANT flag which can be set in a struct cdevsw to indicate that a particular device driver is not Giant-challenged.
SPECFS will DROP_GIANT() ... PICKUP_GIANT() around calls to the driver in question.
Notice that the interrupt path is not affected by this!
This does _NOT_ work for drivers accessed through cdevsw->d_strategy() ie drivers for disk(-like), some tapes, maybe others.
|
104043 |
27-Sep-2002 |
phk |
Rename struct specinfo to the more appropriate struct cdev.
Agreed on: jake, rwatson, jhb
|
104012 |
26-Sep-2002 |
phk |
I hate it when patch gives me .rej files.
Can't we make the pre-commit check refuse if there are .rej files in the directory ?
|
104007 |
26-Sep-2002 |
phk |
Return ENOTTY on unhandled ioctls.
|
104005 |
26-Sep-2002 |
phk |
Return ENOTTY on unrecognized ioctls.
|
104004 |
26-Sep-2002 |
phk |
Return ENOTTY on incorrect ioctls.
|
104003 |
26-Sep-2002 |
phk |
Return ENOTTY when we don't recognize an ioctl.
|
103989 |
26-Sep-2002 |
njl |
Fix these warns where sizeof(int) != sizeof(void *)
/h/des/src/sys/coda/coda_venus.c: In function `venus_ioctl':
/h/des/src/sys/coda/coda_venus.c:277: warning: cast from pointer to integer of different size
/h/des/src/sys/coda/coda_venus.c:292: warning: cast from pointer to integer of different size
/h/des/src/sys/coda/coda_venus.c: In function `venus_readlink':
/h/des/src/sys/coda/coda_venus.c:380: warning: cast from pointer to integer of different size
/h/des/src/sys/coda/coda_venus.c: In function `venus_readdir':
/h/des/src/sys/coda/coda_venus.c:637: warning: cast from pointer to integer of different size
Submitted by: des-alpha-tinderbox
|
103983 |
26-Sep-2002 |
jeff |
- Fix a botch in previous commit; oldvp should not be unconditionally assigned.
|
103979 |
25-Sep-2002 |
semenu |
Fix the problem introduced by vop_stdbmap() usage. NTFS does not implement a useful VOP_BMAP() handler, so it expects the blkno not to be changed by VOP_BMAP(). Otherwise, it would have to find some tricky way to determine in VOP_STRATEGY() whether bp was VOP_BMAP()ed or not.
PR: kern/42139
|
103942 |
25-Sep-2002 |
jeff |
- Use vrefcnt() instead of v_usecount.
|
103937 |
25-Sep-2002 |
jeff |
- Use vrefcnt() instead of directly accessing v_usecount.
|
103936 |
25-Sep-2002 |
jeff |
- Use vrefcnt() where it is safe to do so instead of doing direct and unlocked accesses to v_usecount.
- Lock access to the buf lists in the various sync routines. interlock locking could be avoided almost entirely in leaf filesystems if the fsync function had a generic helper.
|
103935 |
25-Sep-2002 |
jeff |
- Lock access to the buf lists in spec_sync()
- Fixup interlock locking in spec_close()
|
103934 |
25-Sep-2002 |
jeff |
- Hold the vp lock while accessing v_vflags.
|
103870 |
23-Sep-2002 |
alfred |
use __packed.
|
103804 |
22-Sep-2002 |
iedowse |
Attempt to fix the error reported by the alpha tinderbox. A pointer was being cast to an integer as part of a hash function, so just add an intptr_t cast to silence the warning.
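A minimal sketch of hashing a pointer value through an integer type wide enough to hold it. The function ptr_hash(), the bucket count, and the shift are hypothetical; the commit used an intptr_t cast, while this sketch uses the unsigned uintptr_t so that the shift and modulo operate on a non-negative value.

```c
#include <stdint.h>

#define	HASH_BUCKETS	64u	/* hypothetical table size */

/*
 * Map a pointer to a bucket index. Casting through uintptr_t avoids the
 * "cast from pointer to integer of different size" warning on platforms
 * where pointers are wider than int.
 */
static unsigned int
ptr_hash(const void *p)
{
	return ((unsigned int)(((uintptr_t)p >> 4) % HASH_BUCKETS));
}
```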
|
103796 |
22-Sep-2002 |
truckman |
Fix misspellings, capitalization, and punctuation in comments. Minor comment phrasing and style changes.
|
103767 |
21-Sep-2002 |
jake |
Use the fields in the sysentvec and in the vm map header in place of the constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
|
103636 |
19-Sep-2002 |
truckman |
VOP_FSYNC() requires that its vnode argument be locked, which nfs_link() wasn't doing. Rather than just lock and unlock the vnode around the call to VOP_FSYNC(), implement rwatson's suggestion to lock the file vnode in kern_link() before calling VOP_LINK(), since the other filesystems also locked the file vnode right away in their link methods. Remove the locking and unlocking from the leaf filesystem link methods.
Reviewed by: rwatson, bde (except for the unionfs_link() changes)
|
103559 |
18-Sep-2002 |
njl |
Remove any VOP_PRINT that redundantly prints the tag. Move lockmgr_printinfo() into vprint() for everyone's benefit.
Suggested by: bde
|
103537 |
18-Sep-2002 |
bp |
Always open the file in DENYNONE mode and let the server decide what is good for this file. This should allow read-only access to a file which is already open on the server.
|
103533 |
18-Sep-2002 |
bp |
Implement additional SMB calls to allow proper update of file size as some file servers fail to do it in the right way.
New NFLUSHWIRE flag marks pending flush request(s).
NB: not all cases covered by this commit.
Obtained from: Darwin
|
103314 |
14-Sep-2002 |
njl |
Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging.
Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.
Suggested by: phk Reviewed by: bde, rwatson (earlier version)
|
103216 |
11-Sep-2002 |
julian |
Completely redo thread states.
Reviewed by: davidxu@freebsd.org
|
102950 |
05-Sep-2002 |
davidxu |
s/SGNL/SIG/ s/SNGL/SINGLE/ s/SNGLE/SINGLE/
Fix abbreviation for P_STOPPED_* etc flags, in original code they were inconsistent and difficult to distinguish between them.
Approved by: julian (mentor)
|
102821 |
01-Sep-2002 |
iedowse |
Add a missing #include <sys/lockmgr.h>.
|
102412 |
25-Aug-2002 |
charnier |
Replace various spelling with FALLTHROUGH which is lint()able
|
102392 |
25-Aug-2002 |
bde |
Fixed printf format errors and style bugs in rev.1.92. This is the version that should have been committed in rev.1.93.
|
102391 |
25-Aug-2002 |
bde |
Oops, the previous commit wasn't the version that I meant to commit (it does some extra things which are probably harmless). Back it out.
|
102385 |
25-Aug-2002 |
bde |
Fixed printf format errors and style bugs in previous commit.
|
102314 |
23-Aug-2002 |
scottl |
Remove stddef.h from the header list
Prodded by: peter
|
102295 |
22-Aug-2002 |
trhodes |
Fix a bug where large msdos partitions were not handled correctly, and fix a few fsck_msdosfs related 'issues'
PR: 28536, 30168 Submitted by: Jiangyi Liu <jyliu@163.net> && NetBSD Approved by: rwatson (mentor)
|
102170 |
20-Aug-2002 |
scottl |
Remove the possibility of a race condition when reading the . and .. entries.
|
102169 |
20-Aug-2002 |
scottl |
Don't abuse the stack when translating names.
|
102160 |
20-Aug-2002 |
rwatson |
Handle one more case of a fifofs filetmp: set filetmp.f_cred to ap->a_cred, and pass in ap->a_td->td_ucred as the active_cred to soo_poll().
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
102003 |
17-Aug-2002 |
rwatson |
In continuation of early fileop credential changes, modify fo_ioctl() to accept an 'active_cred' argument reflecting the credential of the thread initiating the ioctl operation.
- Change fo_ioctl() to accept active_cred; change consumers of the fo_ioctl() interface to generally pass active_cred from td->td_ucred.
- In fifofs, initialize filetmp.f_cred to ap->a_cred so that the invocations of soo_ioctl() are provided access to the calling f_cred. Pass ap->a_td->td_ucred as the active_cred, but note that this is required because we don't yet distinguish file_cred and active_cred in invoking VOP's.
- Update kqueue_ioctl() for its new argument.
- Update pipe_ioctl() for its new argument, pass active_cred rather than td_ucred to MAC for authorization.
- Update soo_ioctl() for its new argument.
- Update vn_ioctl() for its new argument, use active_cred rather than td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR().
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101983 |
16-Aug-2002 |
rwatson |
Make similar changes to fo_stat() and fo_poll() as made earlier to fo_read() and fo_write(): explicitly use the cred argument to fo_poll() as "active_cred" using the passed file descriptor's f_cred reference to provide access to the file credential. Add an active_cred argument to fo_stat() so that implementers have access to the active credential as well as the file credential. Generally modify callers of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which was redundantly provided via the fp argument. This set of modifications also permits threads to perform these operations on behalf of another thread without modifying their credential.
Trickle this change down into fo_stat/poll() implementations:
- badfo_poll(), badfo_stat(): modify/add arguments.
- kqueue_poll(), kqueue_stat(): modify arguments.
- pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to MAC checks rather than td->td_ucred.
- soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather than cred to pru_sopoll() to maintain current semantics.
- sopoll(): modify arguments.
- vn_poll(), vn_statfile(): modify/add arguments, pass new arguments to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL() to maintain current semantics.
- vn_close(): rename cred to file_cred to reflect reality while I'm here.
- vn_stat(): Add active_cred and file_cred arguments to vn_stat() and consumers so that this distinction is maintained at the VFS as well as 'struct file' layer. Pass active_cred instead of td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics.
- fifofs: modify the creation of a "filetemp" so that the file credential is properly initialized and can be used in the socket code if desired. Pass ap->a_td->td_ucred as the active credential to soo_poll(). If we teach the vnop interface about the distinction between file and active credentials, we would use the active credential here.
Note that current inconsistent passing of active_cred vs. file_cred to VOP's is maintained. It's not clear why GETATTR would be authorized using active_cred while POLL would be authorized using file_cred at the file system level.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101967 |
16-Aug-2002 |
trhodes |
When a cluster entry for ``.'' is set to 0, msdosfs fails to handle it correctly.
PR: 24393 Submitted by: semenu Approved by: rwatson (mentor) MFC after: 1 week
|
101901 |
15-Aug-2002 |
jake |
Fixed 64bit big endian bugs relating to abuse of ioctl argument passing. This makes truss work on sparc64.
|
101895 |
15-Aug-2002 |
scottl |
Clean up comments that are no longer relevant.
|
101890 |
15-Aug-2002 |
scottl |
Factor out some ugly code that's shared by udf_readdir and udf_lookup. Significantly de-obfuscate udf_lookup.
Inspired By: tes@sgi.com
|
101777 |
13-Aug-2002 |
phk |
Introduce typedefs for the member functions of struct vfsops and employ these in the main filesystems. This does not change the resulting code but makes the source a little bit more grepable.
Sponsored by: DARPA and NAI Labs.
|
101404 |
05-Aug-2002 |
pb |
Fix typo in vnode flags causing deadlock in msdosfs_fsync().
Reviewed by: jeff
|
101330 |
04-Aug-2002 |
mike |
Fix typo in the last revision.
Noticed by: i386 tinderbox
|
101317 |
04-Aug-2002 |
scottl |
Simplify the handling of a fragmented file_id descriptor. Also de-obfuscate the file_char flags.
|
101308 |
04-Aug-2002 |
jeff |
- Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed.
- v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
- All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's.
- Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear.
- Many functions in vfs_subr.c were restructured to provide for stronger locking.
Idea stolen from: BSD/OS
|
101202 |
02-Aug-2002 |
scottl |
Calculate the correct physical block number for files that are embedded into their file_entry descriptor. This is more for correctness, since these files cannot be bmap'ed/mmap'ed anyways. Enforce this restriction.
Submitted by: tes@sgi.com
|
101201 |
02-Aug-2002 |
scottl |
Check for deleted files in udf_lookup(), not just udf_readdir().
Submitted by: tes@sgi.com
|
101200 |
02-Aug-2002 |
alc |
o Lock page queue accesses in nwfs and smbfs.
o Assert that the page queues lock is held in vm_page_deactivate().
|
101195 |
02-Aug-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
Teach devfs how to respond to pathconf() _POSIX_MAC_PRESENT queries, allowing it to indicate to user processes that individual vnode labels are available.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101191 |
01-Aug-2002 |
rwatson |
Hook up devfs_pathconf() for specfs devfs nodes, not just regular devfs nodes.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101132 |
01-Aug-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
Modify procfs so that (when mounted multilabel) it exports process MAC labels as the vnode labels of procfs vnodes associated with processes.
Approved by: des Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101130 |
01-Aug-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
Modify pseudofs so that it can support synthetic file systems with the multilabel flag set. In particular, implement vop_refreshlabel() as pn_refreshlabel(). Implement pfs_refreshlabel() to invoke this, and have it fall back to the mount label if the file system does not implement pn_refreshlabel() for the node. Otherwise, permit the file system to determine how the service is provided.
Approved by: des Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101069 |
31-Jul-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
Instrument devfs to support per-dirent MAC labels. In particular, invoke MAC framework when devfs directory entries are instantiated due to make_dev() and related calls, and invoke the MAC framework when vnodes are instantiated from these directory entries. Implement vop_setlabel() for devfs, which pushes the label update into the devfs directory entry for semi-persistent store. This permits the MAC framework to assign labels to devices and directories as they are instantiated, and export access control information via devfs vnodes.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101002 |
31-Jul-2002 |
semenu |
Fix a problem with sendfile() syscall by always doing I/O via bread() in ntfs_read(). This guarantees that requested cache pages will be valid if UIO_NOCOPY is specified.
PR: bin/34072, bin/36189 MFC after: 1 week
|
100994 |
30-Jul-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
Label devfs directory entries, permitting labels to be maintained on device nodes in devfs instances persistently despite vnode recycling.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
100884 |
29-Jul-2002 |
julian |
Create a new thread state to describe threads that would be ready to run except for the fact that they are presently swapped out. Also add a process flag to indicate that the process has started the struggle to swap back in. This will be needed for the case where multiple threads start the swapin action top a collision. Also add code to stop a process from being swapped out if one of the threads in this process is actually off running on another CPU.. that might hurt...
Submitted by: Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>
|
100804 |
28-Jul-2002 |
dd |
Correct misindentation of DRA_UID.
|
100793 |
28-Jul-2002 |
dd |
Unimplement panic(8) by making sure that we don't recurse into a ruleset. If we do, that means there's a ruleset loop (10 includes 20 includes 30 includes 10), which will quickly cause a double fault due to stack overflow (since "include" is implemented by recursion). (Previously, we only checked that X didn't include X.)
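
The loop check described in this entry can be sketched in user space as follows. This is only an illustration under assumptions: `struct ruleset`, `rs_running` and `ruleset_apply()` are hypothetical names chosen for the sketch, not the identifiers used in the devfs rule code. The idea is to mark a ruleset while it is being applied and to refuse to enter a ruleset that is already marked.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INCLUDES 4

/* Hypothetical ruleset: a number plus the rulesets it includes. */
struct ruleset {
	int		 rs_num;
	int		 rs_running;	/* nonzero while this ruleset is being applied */
	size_t		 rs_nincludes;
	struct ruleset	*rs_includes[MAX_INCLUDES];
};

/*
 * Apply a ruleset and, by recursion, everything it includes.
 * Returns 0 on success, or -1 if an include loop is detected.
 */
int
ruleset_apply(struct ruleset *rs)
{
	size_t i;
	int error = 0;

	if (rs->rs_running)
		return (-1);		/* we are already inside this ruleset */
	rs->rs_running = 1;
	for (i = 0; i < rs->rs_nincludes && error == 0; i++)
		error = ruleset_apply(rs->rs_includes[i]);
	rs->rs_running = 0;		/* always clear the mark on the way out */
	return (error);
}
```

A check that only compares a ruleset with its direct includes would miss the 10, 20, 30, 10 chain; the per-ruleset mark catches a loop of any length.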
|
100738 |
27-Jul-2002 |
jeff |
- Explicitly state that specfs does not support locking by using vop_no{lock,unlock,islocked}. This should be the only vnode opv that does so.
|
100737 |
27-Jul-2002 |
alc |
o Lock page queue accesses by vm_page_activate() and vm_page_deactivate().
|
100206 |
17-Jul-2002 |
dd |
Introduce the DEVFS "rule" subsystem. DEVFS rules permit the administrator to define certain properties of new devfs nodes before they become visible to the userland. Both static (e.g., /dev/speaker) and dynamic (e.g., /dev/bpf*, some removable devices) nodes are supported. Each DEVFS mount may have a different ruleset assigned to it, permitting different policies to be implemented for things like jails.
Approved by: phk
|
100164 |
16-Jul-2002 |
markm |
Unbreak LINT; sort the includes so that functions are explicitly declared. Remove duplicate includes.
|
99689 |
09-Jul-2002 |
jeff |
- Change all LK_SHARE locks to LK_EXCLUSIVE. Shared locks aren't quite safe yet.
- Use vop_std{lock,unlock,islocked}.
|
99566 |
08-Jul-2002 |
jeff |
Lock down pseudofs:
- Initialize lock structure in vncache_alloc
- Return locked vnodes from vncache_alloc
- Setup vnode op vectors to use default lock, unlock, and islocked
- Implement simple locking scheme required for lookup
|
99072 |
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. To come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
98266 |
15-Jun-2002 |
mux |
nmount'ify unionfs further by using separate options instead of passing a flags mount option. This removes the include of sys/fs/unionfs/union.h in mount_unionfs as it should be.
Reviewed by: phk
|
98265 |
15-Jun-2002 |
mux |
Convert UDF to nmount.
Reviewed by: scottl
|
98183 |
13-Jun-2002 |
semenu |
Fix a race during null node creation between looking up the hash again and adding the vnode to the hash. The fix is to use an atomic hash-lookup-and-add-if-not-found operation. The odd thing is that this race can't actually happen because the lowervp vnode is now locked exclusively during the whole process of null node creation. This should be thought of as a step toward shared lookups.
Also remove vp->v_mount checks when looking for a match in the hash, as this is a vestige.
Also add comments and cosmetic changes.
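
A minimal sketch of the lookup-and-add-if-not-found pattern mentioned in this entry, assuming a simple chained hash table. The names `struct node`, `n_lower` and `hash_get_or_insert()` are hypothetical and are not the nullfs identifiers. The sketch is single-threaded; the comments mark where a kernel implementation would be expected to hold the hash lock so that the lookup and the insert form one operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NHASH 8

/* Hypothetical upper-layer node, keyed by the lower-layer object it wraps. */
struct node {
	struct node	*n_next;
	const void	*n_lower;
};

static struct node *hashtbl[NHASH];

/*
 * Return the node already hashed for newnode's key, or insert newnode
 * and return it. Doing both steps under one lock leaves no window in
 * which two callers can each miss the lookup and then both insert.
 */
struct node *
hash_get_or_insert(struct node *newnode)
{
	size_t h = ((uintptr_t)newnode->n_lower >> 4) % NHASH;
	struct node *n;

	/* A real implementation would acquire the hash lock here. */
	for (n = hashtbl[h]; n != NULL; n = n->n_next) {
		if (n->n_lower == newnode->n_lower)
			return (n);	/* ...and release the lock before returning */
	}
	newnode->n_next = hashtbl[h];
	hashtbl[h] = newnode;
	/* ...and release the lock here. */
	return (newnode);
}
```

The caller compares the return value with the node it passed in: if they differ, somebody else got there first and the caller's own node can be discarded.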
|
98177 |
13-Jun-2002 |
semenu |
Change null_hashlock into null_hashmtx, because there is no need for lockmgr and this helps to vget() vnode from hash without a race.
Reviewed by: bp MFC after: 2 weeks
|
98176 |
13-Jun-2002 |
semenu |
Fix the "error" path (when dropping not fully initialized vnode). Also move hash operations out of null_vnops.c and explicitly initialize v_lock in null_node_alloc (to set wmesg).
Reviewed by: bp MFC after: 2 weeks
|
98175 |
13-Jun-2002 |
semenu |
Fix wrong locking in null_inactive and null_reclaim. This makes nullfs mostly work again.
Reviewed by: mckusick, bp
|
97940 |
06-Jun-2002 |
des |
Gratuitous whitespace cleanup.
|
97702 |
01-Jun-2002 |
semenu |
Make devfs honour the PDIRUNLOCK flag.
Reviewed by: jeff MFC after: 1 week
|
97658 |
31-May-2002 |
tanimura |
Back out my last commit of locking down a socket, it conflicts with hsu's work.
Requested by: hsu
|
97195 |
24-May-2002 |
mux |
Convert unionfs to nmount.
|
97192 |
24-May-2002 |
mux |
Fix comments.
|
97186 |
23-May-2002 |
mux |
Convert nullfs to nmount.
|
97094 |
22-May-2002 |
bde |
Quick fix for non-unique inode numbers for hard links. We use the byte offset of the directory entry for the inode number for all types of files except directories, although this breaks hard links for non-directories even if it doesn't cause overflow. Just ignore this broken inode number for stat() and readdir() and return a less broken one (the block offset of the file), so that applications normally can't see the brokenness.
This leaves at least the following brokenness: - extra inodes, vnodes and caching for hard links. - various overflow bugs. cd9660 supports 64-bit block numbers, but we silently ignore the top 32 bits in isonum_733() and then drop another 10 bits for our broken inode numbers. We may also have sign extension bugs from storing 32-bit extents in ints and longs even if ints are 32-bits. These bugs affect DVDs. mkisofs apparently limits them by writing directory entries first.
Inode numbers were broken mainly in 4.4BSD-Lite2. FreeBSD-1.1.5 seems to have a correct implementation modulo the overflow bugs. We need to look up directory entries from inodes for symlinks only. FreeBSD-1.1.5 uses separate fields (iso_parent_extent, iso_parent) to point to the directory entry. 4.4BSD-Lite doesn't have these, and abuses i_ino to point to the directory entry. Correct pointers are impossible for hard links, but symlinks can't be hard links.
|
97072 |
21-May-2002 |
semenu |
Fix null_lock() not unlocking vp->v_interlock if LK_THISLAYER.
Reviewed by: bp@FreeBSD.org MFC after: 1 week
|
97035 |
21-May-2002 |
tanimura |
Lock the writer socket across sorwakeup(fip->fi_writesock).
Spotted by: peter
|
96972 |
20-May-2002 |
tanimura |
Lock down a socket, milestone 1.
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket.
o Determine the lock strategy for each member in struct socket.
o Lock down the following members:
- so_count - so_options - so_linger - so_state
o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket:
- sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup()
Reviewed by: alfred
|
96886 |
19-May-2002 |
jhb |
Change p_can{debug,see,sched,signal}()'s first argument to be a thread pointer instead of a proc pointer and require the process pointed to by the second argument to be locked. We now use the thread ucred reference for the credential checks in p_can*() as a result. p_canfoo() should now no longer need Giant.
|
96847 |
18-May-2002 |
phk |
Remove a check of blocknumbers/offsets which will be pointless with 64 bit daddr_t.
Sponsored by: DARPA & NAI Labs.
|
96755 |
16-May-2002 |
trhodes |
More s/file system/filesystem/g
|
96750 |
16-May-2002 |
mux |
In VOP_LOOKUP, don't assume that the final pathname component will be in the same filesystem as the one where the current component is.
Approved by: scottl
|
96572 |
14-May-2002 |
phk |
Make daddr_t and u_daddr_t 64bits wide. Retire daddr64_t and use daddr_t instead.
Sponsored by: DARPA & NAI Labs.
|
96356 |
10-May-2002 |
mux |
Fix several bugs in devfs_lookupx(). When we check the nameiop to make sure it's a correct operation for devfs, do it only in the ISLASTCN case. If we don't, we are assuming that the final file will be in devfs, which is not true if another partition is mounted on top of devfs or with special filenames (like /dev/net/../../foo).
Reviewed by: phk
|
96009 |
04-May-2002 |
jeff |
Include systm.h for panic(9) so that DEBUG_ALL_VFS_LOCKS compiles.
|
95994 |
03-May-2002 |
phk |
HPFS picks up the vop_stdgetpages and vop_stdputpages member functions via the default entry and the default vop vector.
|
95984 |
03-May-2002 |
des |
s/pfs_badop/vop_eopnotsupp/
Submitted by: phk
|
95954 |
02-May-2002 |
mux |
Convert devfs to nmount.
Reviewed by: phk
|
95953 |
02-May-2002 |
mux |
Convert the pseudofs framework to nmount (thus procfs and linprocfs).
Reviewed by: des (some time ago), phk
|
95952 |
02-May-2002 |
mux |
Convert fdescfs to nmount.
Reviewed by: phk
|
95951 |
02-May-2002 |
scottl |
Don't reference vop_std* since they are already implicitly referenced through the VOP_DEFAULT vector
Submitted by: phk
|
95944 |
02-May-2002 |
phk |
Use vop_panic() instead of rolling our own.
|
95913 |
02-May-2002 |
scottl |
In udf_bmap(), return the physical block number, not the logical block number. This fixes things like cp (ouch!) which use mmap.
|
95767 |
30-Apr-2002 |
scottl |
Fix udf_read(). Honor the uio_resid when determining the size of the block to read and copy out. This removes the hack in udf_readatoffset() for only reading one block at a time. WooHoo! Remove a redundant test for fragmented fids in both udf_readdir() and udf_lookup(). Add comment to both as to why the test is written the way it is. Add a few more safety checks for brelse().
Thanks to Timothy Shimmin <tes@boing.melbourne.sgi.com> for pointing out these problems.
|
95759 |
30-Apr-2002 |
tanimura |
Revert the change of #includes in sys/filedesc.h and sys/socketvar.h.
Requested by: bde
Since locking sigio_lock is usually followed by calling pgsigio(), move the declaration of sigio_lock and the definitions of SIGIO_*() to sys/signalvar.h.
While I am here, sort include files alphabetically, where possible.
|
95750 |
29-Apr-2002 |
rwatson |
Use vnode locking with devfs; permit VFS locking assertions to make sense for devfs vnodes, and reduce/remove potential races in the devfs code.
Submitted by: iadowse Approved by: phk
|
95480 |
26-Apr-2002 |
bp |
UIO_NOCOPY is not supported for now, so refuse the read operation if this flag is set. The full emulation of bio is on its way...
|
95315 |
23-Apr-2002 |
bp |
Track nfs's getpages() changes:
Properly count v_vnodepgsin. Do not reread a page if it is already valid. Properly handle partially filled pages.
|
95314 |
23-Apr-2002 |
bp |
Get rid of extra #ifdefs.
|
95212 |
21-Apr-2002 |
bde |
Don't attempt to declare M_DEVFS when MALLOC_DECLARE is not defined. This fixes warnings that should be errors in fstat.
Reminded by: alpha tinderbox
Fixed some style bugs (ones near BOF and EOF; there are many more).
|
95210 |
21-Apr-2002 |
bde |
Include <sys/systm.h> for (at least) the definition of atomic functions which are sometimes used by the macros in <sys/mutex.h>; don't depend on not-quite-necessary namespace pollution in <sys/mutex.h>.
|
95094 |
20-Apr-2002 |
marcel |
Don't put a line break in string literals. GCC 3.1 complains and GCC 3.2 drops the ball.
|
95090 |
20-Apr-2002 |
rwatson |
Spelling fix for comment.
|
94995 |
18-Apr-2002 |
alfred |
Cleanup of logic, flow and comments.
Submitted by: bde
|
94861 |
16-Apr-2002 |
jhb |
Lock proctree_lock instead of pgrpsess_lock.
|
94795 |
15-Apr-2002 |
asmodai |
Sync with UDF p4 tree: Use POSIX integer types instead of BSD types.
|
94663 |
14-Apr-2002 |
scottl |
Actually add the UDF files!
|
94637 |
14-Apr-2002 |
jhb |
Remove stale XXX comment.
|
94624 |
13-Apr-2002 |
jhb |
- Change procfs_control()'s first argument to be a thread pointer instead of a process pointer.
- Move the p_candebug() at the start of procfs_control() a bit to make locking feasible. We still perform the access check before doing anything, we just now perform it after acquiring locks.
- Don't lock the sched_lock for TRACE_WAIT_P() and when checking to see if p_stat is SSTOP. We lock the process while setting p_stat to SSTOP so locking the process is sufficient to do a read to see if p_stat is SSTOP or not.
|
94623 |
13-Apr-2002 |
jhb |
Lock the target process for p_candebug().
|
94622 |
13-Apr-2002 |
jhb |
Lock the target process in procfs_doproc*regs() for p_candebug and while reading/writing the registers.
|
94620 |
13-Apr-2002 |
jhb |
- p_cansee() needs the target process locked.
- We need the proc lock held for more of procfs_doprocstatus().
|
94602 |
13-Apr-2002 |
bp |
Check write permissions before creating anything.
PR: kern/27883 MFC after: 1 week
|
94177 |
08-Apr-2002 |
phk |
Remove 3 instances of vm_zone.h inclusion.
|
94167 |
08-Apr-2002 |
jeff |
Change the vm_zone calls over to uma calls. Remove the reference to the vm_zone header.
|
93886 |
05-Apr-2002 |
bde |
Fixed assorted bugs in setting of timestamps in devfs_setattr().
Setting of timestamps on devices had no effect visible to userland because timestamps for devices were set in places that are never used. This broke:
- update of file change time after a change of an attribute
- setting of file access and modification times.
The VA_UTIMES_NULL case did not work. Revs 1.31-1.32 were supposed to fix this by copying correct bits from ufs, but had little or no effect because the old checks were not removed.
|
93883 |
05-Apr-2002 |
bde |
Fixed a very old bug in setting timestamps using utimes(2) on msdosfs files. We didn't clear the update marks when we set the times, so some of the settings were sometimes clobbered with the current time a little later. This caused cp -p even by root to almost always fail to preserve any times despite not reporting any errors in attempting to preserve them.
Don't forget to set the archive attribute when we set the read-only attribute. We should only set the archive attribute if we actually change something, but we mostly don't bother avoiding setting it elsewhere, so don't bother here yet.
MFC after: 1 week
|
93818 |
04-Apr-2002 |
jhb |
Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
|
93793 |
04-Apr-2002 |
bde |
Moved signal handling and rescheduling from userret() to ast() so that they aren't in the usual path of execution for syscalls and traps. The main complication for this is that we have to set flags to control ast() everywhere that changes the signal mask.
Avoid locking in userret() in most of the remaining cases.
Submitted by: luoqi (first part only, long ago, reorganized by me) Reminded by: dillon
|
93593 |
01-Apr-2002 |
jhb |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
|
93430 |
30-Mar-2002 |
bde |
In ffs_mountffs(), set mnt_iosize_max to si_iosize_max unconditionally provided the latter is nonzero. At this point, the former is a fairly arbitrary default value (DFLTPHYS), so changing it to any reasonable value specified by the device driver is safe. Using the maximum of these limits broke ffs clustered i/o for devices whose si_iosize_max is < DFLTPHYS. Using the minimum would break device drivers' ability to increase the active limit from DFLTPHYS up to MAXPHYS.
Copied the code for this and the associated (unnecessary?) fixup of mnt_iosize_max to all other filesystems that use clustering (ext2fs and msdosfs). It was completely missing.
PR: 36309 MFC-after: 1 week
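
The selection rule in this entry can be written out as a small standalone function. This is a sketch under assumptions: `pick_iosize_max()` is a hypothetical helper, the two constants are illustrative stand-ins for the kernel's DFLTPHYS and MAXPHYS rather than their authoritative values, and the final clamp is a guess at what the "fixup" mentioned above does.

```c
#include <assert.h>

/* Illustrative stand-ins for DFLTPHYS and MAXPHYS (assumed values). */
enum {
	SKETCH_DFLTPHYS = 64 * 1024,
	SKETCH_MAXPHYS = 128 * 1024
};

/*
 * Choose the per-mount I/O size limit from the device's limit.
 * A device limit of 0 means the driver did not specify one.
 */
int
pick_iosize_max(int si_iosize_max)
{
	int mnt_iosize_max = SKETCH_DFLTPHYS;	/* arbitrary default */

	/* Take the driver's value as-is, whether smaller or larger. */
	if (si_iosize_max != 0)
		mnt_iosize_max = si_iosize_max;
	/* Assumed fixup: never exceed the system-wide maximum. */
	if (mnt_iosize_max > SKETCH_MAXPHYS)
		mnt_iosize_max = SKETCH_MAXPHYS;
	return (mnt_iosize_max);
}
```

Taking max(default, device) would keep the default for a device that can only do less, which is the clustered i/o breakage described above; taking min(default, device) would never let a driver raise the limit.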
|
93393 |
29-Mar-2002 |
alfred |
Protect proc struct (p_args and p_comm) when doing procfs IO that pulls data from it.
Submitted by: Jonathan Mini <mini@haikugeek.com>
|
93075 |
24-Mar-2002 |
bde |
Fixed some style bugs in the removal of __P(()). Continuation lines were not outdented to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting in some cases.
|
93012 |
23-Mar-2002 |
bde |
Fixed some style bugs in the removal of __P(()). Continuation lines were not outdented to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting.
|
92785 |
20-Mar-2002 |
jeff |
Remove references to vm_zone.h and switch over to the new uma API.
|
92765 |
20-Mar-2002 |
alfred |
Remove __P.
|
92755 |
20-Mar-2002 |
alfred |
Remove __P.
|
92727 |
19-Mar-2002 |
alfred |
Remove __P.
|
92540 |
18-Mar-2002 |
mckusick |
Cannot release vnode underlying the nullfs vnode in null_inactive as it leaves the nullfs vnode allocated, but with no identity. The effect is that a null mount can slowly accumulate all the vnodes in the system, reclaiming them only when it is unmounted. Thus the null_inactive state instead accelerates the release of the null vnode by calling vrecycle which will in turn call the null_reclaim operator. The null_reclaim routine then does the freeing actions previously (incorrectly) done in null_inactive.
|
92462 |
17-Mar-2002 |
mckusick |
Add a flags parameter to VFS_VGET to pass through the desired locking flags when acquiring a vnode. The immediate purpose is to allow polling lock requests (LK_NOWAIT) needed by soft updates to avoid deadlock when enlisting other processes to help with the background cleanup. For the future it will allow the use of shared locks for read access to vnodes. This change touches a lot of files as it affects most filesystems within the system. It has been well tested on FFS, loopback, and CD-ROM filesystems, but only lightly on the others, so if you find a problem there, please let me (mckusick@mckusick.com) know.
|
92363 |
15-Mar-2002 |
mckusick |
Introduce the new 64-bit size disk block, daddr64_t. Change the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.
|
92270 |
14-Mar-2002 |
maxim |
Be consistent with UFS in how devfs_setattr() checks credentials for chmod(2), chown(2) and utimes(2) with respect to jail(2).
Reviewed by: rwatson, ru Not objected by: phk Approved by: ru
|
91683 |
05-Mar-2002 |
phk |
If in strategy we find that we have no devsw on the device anymore we are probably talking about some disk-device which went away, so return ENXIO instead of panicking.
|
91406 |
27-Feb-2002 |
jhb |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
91181 |
23-Feb-2002 |
tmm |
Fix LINT breakage by adding a missing include.
|
91140 |
23-Feb-2002 |
tanimura |
Lock struct pgrp, session and sigio.
New locks are:
- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.
Please refer to sys/proc.h for the coverage of these locks.
Changes on the pgrp/session interface:
- pgfind() needs the pgrpsess_lock held.
- The caller of enterpgrp() is responsible to allocate a new pgrp and session.
- Call enterthispgrp() in order to enter an existing pgrp.
- pgsignal() requires a pgrp lock held.
Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)
|
90873 |
18-Feb-2002 |
des |
Paranoia: if the process is setugid, set all sensitive files mode 0.
|
90785 |
17-Feb-2002 |
phk |
Don't even think about using v_id for magic tricks, v_id is giving us enough trouble as it is for SMPng.
|
90717 |
16-Feb-2002 |
bde |
Fixed the following style bugs:
- clobbering of jsp's $Id$ by FreeBSD's old $Id$.
- long lines in recent KSE changes (procfs_ctl.c).
- other style bugs in KSE changes (most related to a shadowed variable in procfs_status.c -- the td in the outer scope is obfuscated by PFS_FILL_ARGS).
Approved by: des
|
90716 |
16-Feb-2002 |
bde |
Fixed the following style bugs:
- clobbering of jsp's $Id$ by FreeBSD's old $Id$.
- lost Berkeley id in procfs_dbregs.c
- long lines in recent KSE changes.
- various gratuitous differences between procfs_*regs.c.
|
90715 |
16-Feb-2002 |
bde |
Fixed missing PHOLD()/PRELE().
Obtained from: procfs_dbregs.c Approved by: des
|
90489 |
10-Feb-2002 |
phk |
Various nit-picking, mostly of style(9) character.
Obtained from: ~bde/sys.dif.gz
|
90448 |
10-Feb-2002 |
rwatson |
Part I: Update extended attribute API and ABI:
o Modify the system call syntax for extattr_{get,set}_{fd,file}() so as not to use the scatter gather API (which appeared not to be used by any consumers, and be less portable), rather, accepts 'data' and 'nbytes' in the style of other simple read/write interfaces. This changes the API and ABI.
o Modify system call semantics so that extattr_get_{fd,file}() return a size_t. When performing a read, the number of bytes read will be returned, unless the data pointer is NULL, in which case the number of bytes of data are returned. This changes the API only.
o Modify the VOP_GETEXTATTR() vnode operation to accept a *size_t argument so as to return the size, if desirable. If set to NULL, the size will not be returned.
o Update various filesystems (pseudofs, ufs) to DTRT.
These changes should make extended attributes more useful and more portable. More commits to rebuild the system call files, as well as update userland utilities to follow.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
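
The read semantics described in this entry (return the number of bytes read, or the size of the attribute data when the data pointer is NULL) can be modelled in user space. This is only a sketch: `model_extattr_get()` and its in-memory attribute argument are hypothetical and do not reproduce the system call's signature, and the `ssize_t` return type is an assumption of the sketch.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/*
 * Model of the "get" semantics for one attribute held in memory.
 * attr/attrlen describe the stored attribute value; data/nbytes are
 * the caller's buffer, in the style of a simple read interface.
 */
ssize_t
model_extattr_get(const char *attr, size_t attrlen, void *data, size_t nbytes)
{
	size_t n;

	/* NULL buffer: a pure size query, nothing is copied. */
	if (data == NULL)
		return ((ssize_t)attrlen);

	/* Otherwise copy as much as fits and report how much was read. */
	n = attrlen < nbytes ? attrlen : nbytes;
	memcpy(data, attr, n);
	return ((ssize_t)n);
}
```

A caller would typically make the size query first, allocate a buffer of that size, and then call again to fetch the data.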
|
90361 |
07-Feb-2002 |
julian |
Pre-KSE/M3 commit. This is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embedded there but remove all the code that assumes that in preparation for the next commit which will actually move it out.
Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice
|
90206 |
04-Feb-2002 |
rwatson |
Change EPERM to EOPNOTSUPP when failing pseudofs_setattr() arbitrarily.
Quoth the alfred: The latter would be better.
|
90205 |
04-Feb-2002 |
rwatson |
Return EPERM instead of 0 in the un-implemented pseudofs_setattr(). Conceivably, it should even return EOPNOTSUPP.
|
89376 |
14-Jan-2002 |
alfred |
Fix select on fifos.
Backout revision 1.56 and 1.57 of fifo_vnops.c.
Introduce a new poll op "POLLINIGNEOF" that can be used to ignore EOF on a fifo, POLLIN/POLLRDNORM is converted to POLLINIGNEOF within the FIFO implementation to effect the correct behavior.
This should allow one to view a fifo pretty much as a data source rather than worry about connections coming and going.
Reviewed by: bde
|
89372 |
14-Jan-2002 |
semenu |
Commit a known fix for hpfs to use vop_defaultop plug instead of wrong hpfs_bypass() routine.
MFC after: 1 day
|
89325 |
14-Jan-2002 |
alfred |
don't initialize the mutex in the temporary struct file, the soo_* functions just grab f_data and don't muck with anything else so this should be ok.
this fixes a panic with invariants where it thinks we've doubly initialized the filetmp mutex even though all we've done is neglect to bzero it.
|
89319 |
14-Jan-2002 |
alfred |
Replace ffind_* with fget calls.
Make fget MPsafe.
Make fgetvp and fgetsock use the fget subsystem to reduce code bloat.
Push giant down in fpathconf().
|
89317 |
13-Jan-2002 |
alfred |
remove unused socket pointer
|
89316 |
13-Jan-2002 |
alfred |
Include sys/_lock.h and sys/_mutex.h to reduce namespace pollution.
Requested by: jhb
|
89306 |
13-Jan-2002 |
alfred |
SMP Lock struct file, filedesc and the global file list.
Seigo Tanimura (tanimura) posted the initial delta.
I've polished it quite a bit reducing the need for locking and adapting it for KSE.
Locks:
1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked.
1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex.
1 sx lock for the global filelist.
struct file * fhold(struct file *fp); /* increments reference count on a file */
struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */
struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */
struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */
I still have to smp-safe the fget cruft, I'll get to that asap.
|
89118 |
09-Jan-2002 |
msmith |
Add a new sysinit SI_SUB_DEVFS. Devfs hooks into the kernel at SI_ORDER_FIRST, and devices can be created anytime after that.
Print a warning if an attempt is made to create a device too early.
|
89107 |
09-Jan-2002 |
msmith |
Use a sysinit to initialise the devfs hooks in kern_conf.c rather than common variables.
Reviewed by: phk (in principle)
|
89090 |
08-Jan-2002 |
msmith |
Staticise the coda vfsop pointer.
|
89071 |
08-Jan-2002 |
msmith |
Staticise pfs_vncache, it's not used anywhere else.
Reviewed by: des
|
88868 |
04-Jan-2002 |
tanimura |
Do not dereference null.
Reviewed by: des
|
88739 |
31-Dec-2001 |
rwatson |
o Make the credential used by socreate() an explicit argument to socreate(), rather than getting it implicitly from the thread argument.
o Make NFS cache the credential provided at mount-time, and use the cached credential (nfsmount->nm_cred) when making calls to socreate() on initially connecting, or reconnecting the socket.
This fixes bugs involving NFS over TCP and ipfw uid/gid rules, as well as bugs involving NFS and mandatory access control implementations.
Reviewed by: freebsd-arch
|
88318 |
20-Dec-2001 |
dillon |
Fix a BUF_TIMELOCK race against BUF_LOCK and fix a deadlock in vget() against VM_WAIT in the pageout code. Both fixes involve adjusting the lockmgr's timeout capability so locks obtained with timeouts do not interfere with locks obtained without a timeout.
Hopefully MFC: before the 4.5 release
|
88279 |
20-Dec-2001 |
bp |
Previous commit was intended to silence a warning, not to change codepath.
|
88263 |
20-Dec-2001 |
sheldonh |
Silence harmless "smbfs_closel: Negative opencount" messages at unmount time.
Thanks to iedowse for the background information.
Submitted by: bp
|
88234 |
19-Dec-2001 |
dillon |
Pseudofs was leaking VFS cache entries badly due to its cache and use of the wrong VOP descriptor. This misuse caused VFS-cached vnodes to be re-cached, resulting in the leak. This commit is an interim fix until DES has a chance to rework the code involved.
|
87798 |
13-Dec-2001 |
sheldonh |
Add module dependency on libmchain.
With this change, mounting an smb share (using mount_smb, which is not yet included in the tree) without any of smbfs, libiconv or libmchain compiled into the kernel or loaded works.
|
87725 |
12-Dec-2001 |
alfred |
Fix select on named pipes without a reader.
PR: kern/19871 MFC after: 1 month
|
87670 |
11-Dec-2001 |
green |
Add VOP_GETEXTATTR(9) passthrough support to pseudofs.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
87669 |
11-Dec-2001 |
des |
Remove an obsolete prototype for procfs_kmemaccess().
Submitted by: rwatson
|
87599 |
10-Dec-2001 |
obrien |
Update to C99, s/__FUNCTION__/__func__/, also don't use ANSI string concatenation.
|
87542 |
09-Dec-2001 |
des |
Fix various bugs in the debugging code and reenable it.
|
87541 |
09-Dec-2001 |
des |
Fix an incorrect PFS_TRACE. Also, use __func__ instead of __FUNCTION__.
|
87538 |
08-Dec-2001 |
des |
Fix a KSEfication brain-o in procfs_doprocfile(): return the path of the target process, not the calling process. While we're here, also unstaticize procfs_doprocfile() and procfs_docurproc() so linprocfs can call them directly instead of duplicating them.
Submitted by: Dominic Mitchell <dom@semantico.com>
|
87321 |
04-Dec-2001 |
des |
Pseudofsize procfs(5).
|
87275 |
03-Dec-2001 |
rwatson |
o Introduce pr_mtx into struct prison, providing protection for the mutable contents of struct prison (hostname, securelevel, refcount, pr_linux, ...)
o Generally introduce mtx_lock()/mtx_unlock() calls throughout kern/ so as to enforce these protections, in particular, in kern_mib.c protecting sysctl access to the hostname and securelevel, as well as kern_prot.c access to the securelevel for access control purposes.
o Rewrite linux emulator abstractions for accessing per-jail linux mib entries (osname, osrelease, osversion) so that they don't return a pointer to the text in the struct linux_prison, rather, a copy to an array passed into the calls. Likewise, update linprocfs to use these primitives.
o Update in_pcb.c to always use prison_getip() rather than directly accessing struct prison.
Reviewed by: jhb
|
87194 |
02-Dec-2001 |
bp |
Catch up with KSE changes.
Submitted by: Max Khon <fjoe@iclub.nsu.ru>
|
87068 |
28-Nov-2001 |
jhb |
Fix indentation after removing GEMDOS support. Whitespace changes only.
|
87067 |
28-Nov-2001 |
jhb |
Use suser_td() instead of explicitly checking cr_uid against 0.
PR: kern/21809 Submitted by: <mbendiks@eunet.no> Reviewed by: rwatson
|
87061 |
28-Nov-2001 |
jhb |
Axe more unused GEMDOS code that was #ifdef atari.
PR: kern/21809 Submitted by: <mbendiks@eunet.no>
|
87007 |
27-Nov-2001 |
jhb |
Remove GEMDOS support from msdosfs. I don't think anyone is going to port FreeBSD to Atari machines any time soon.
|
86969 |
27-Nov-2001 |
des |
Add support for a last-close handler. Revert the module version bumps; they're quite pointless as long as the only pseudofs consumer is linprocfs, which is in the tree.
|
86941 |
27-Nov-2001 |
ken |
Fix mounting root from an ISO9660 filesystem on a SCSI CDROM.
The problem was that the ISO9660 code wasn't opening the device prior to issuing ioctl calls. In particular, the device must be open before iso_get_ssector() is called in iso_mountroot().
If the device isn't opened first, the disk layer blows up due to an uninitialized variable.
The solution was to open the device, call iso_get_ssector() and then close it again.
The ATAPI CDROM driver doesn't have this problem because it doesn't use the disk layer, and evidently doesn't mind if someone issues an ioctl without first issuing an open call.
Thanks to phk for pointing me at the source of this problem.
Tested by: dirk MFC after: 1 week
|
86931 |
27-Nov-2001 |
jhb |
Replace 'p' with 'td' as appropriate.
|
86930 |
27-Nov-2001 |
jhb |
GC compat macros HASHINIT, VOP__LOCK, VOP__UNLOCK, VGET, and VN_LOCK.
|
86929 |
27-Nov-2001 |
jhb |
Expand LOCKMGR() compat macro.
|
86928 |
26-Nov-2001 |
jhb |
GC some KSE compatibility macros that were somehow still here.
|
86927 |
26-Nov-2001 |
jhb |
GC non-FreeBSD code that didn't work anyways.
|
86892 |
25-Nov-2001 |
dd |
Address two minor issues: implement the _PC_NAME_MAX and _PC_PATH_MAX pathconf() variables for directories, and set st_size and st_blocks (of struct stat) for directories as appropriate. Note that st_size is always set to DEV_BSIZE, since the size of the directories is not currently kept.
Reviewed by: phk, bde
|
86872 |
24-Nov-2001 |
dillon |
convert holdsock() to fget(). Add XXX reminder for future socket locking.
|
86481 |
17-Nov-2001 |
peter |
Missing KSE s/curproc/curthread/
|
86185 |
08-Nov-2001 |
alfred |
Switch the behavior of FIFOs to more closely match what goes on in other OSes. Without this change, FIFOs are a real pain to abuse as a rendezvous point: you can't really select(2) on them, because they always return ready even when there is no writer (to signal EOF).
Obtained from: BSD/OS
|
86165 |
07-Nov-2001 |
peter |
Fix printf format bugs introduced in rev 1.34 for printing times. quad_t cannot be printed with %lld on 64 bit systems.
Don't waste CPU rounding user and system times up to long long; it is highly improbable that a process will have accumulated 68 years of user or system CPU time (not wall clock time) before a reboot or process restart.
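The format mismatch described in this entry can be illustrated in userland. The sketch below is not the change that was committed; it shows one portable alternative, widening to intmax_t and printing with %jd. The names quad_example_t and format_quad() are hypothetical, and int64_t is used as a stand-in for quad_t.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the BSD quad_t type (assumed here to be 64-bit signed). */
typedef int64_t quad_example_t;

/*
 * Format a 64-bit count without assuming that the type is "long long":
 * widen the value to intmax_t and use the matching %jd conversion.
 * Returns what snprintf() returns (the length of the formatted text).
 */
static int
format_quad(char *buf, size_t len, quad_example_t value)
{
	return (snprintf(buf, len, "%jd", (intmax_t)value));
}
```

Calling format_quad() with a value just past the 32-bit signed range, such as 2147483648 (the number of seconds in roughly 68 years), produces the full decimal text on both 32-bit and 64-bit systems.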
|
86136 |
06-Nov-2001 |
green |
Correctly unlock the target process if /proc/$foo/mem is open()ed by another process which cannot p_candebug() it. The bug was introduced in rev. 1.100.
Approved by: des
|
86056 |
04-Nov-2001 |
dillon |
Fix the fix. BIO_ERROR must be set in b_ioflags, not b_flags
|
86040 |
04-Nov-2001 |
phk |
Fix "echo > /dev/null" for non-root users which broke in previous commit.
|
86037 |
04-Nov-2001 |
dillon |
Add mnt_reservedvnlist so we can MFC to 4.x, in order to make all mount structure changes now rather than piecemeal later on. mnt_nvnodelist currently holds all the vnodes under the mount point. This will eventually be split into a 'dirty' and 'clean' list. This way we only break kld's once rather than twice. nvnodelist will eventually turn into the dirty list and should remain compatible with the klds.
|
86009 |
04-Nov-2001 |
phk |
B_ERROR is BIO_ERROR on -current.
Now it compiles, I don't know if it works.
|
86003 |
04-Nov-2001 |
dillon |
Fix a bug in CD9660 when vmiodirenable is turned on. CD9660 was assuming that a buffer's b_blkno would be valid. This is true when vmiodirenable is turned off because the B_MALLOC'd buffer's data is invalidated when the buffer is destroyed. But when vmiodirenable is turned on a buffer can be reconstituted from its VMIO backing store. The reconstituted buffer will have no knowledge of the physical block translation and the result is serious directory corruption of the CDROM.
The solution is to fix cd9660_blkatoff() to always BMAP the buffer if b_lblkno == b_blkno.
MFC after: 0 days
|
85980 |
03-Nov-2001 |
phk |
Use vfs_timestamp() instead of getnanotime().
Add magic stuff copied from ufs_setattr().
Instructed by: bde
|
85979 |
03-Nov-2001 |
phk |
Use vfs_timestamp() instead of getnanotime() directly. Fix some modes on directories and symlinks.
Instructed by: bde
|
85940 |
03-Nov-2001 |
des |
Reduce the number of #include dependencies by declaring some of the structs used in pseudofs.h as opaque structs.
|
85644 |
28-Oct-2001 |
dillon |
Adjust printfs to be time_t agnostic.
|
85561 |
26-Oct-2001 |
des |
Add VOP_IOCTL support, and fix a bug that would cause a panic if a file or symlink lacked a filler function.
|
85339 |
23-Oct-2001 |
dillon |
Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes.
MFC after: 3 days
|
85320 |
22-Oct-2001 |
des |
No, you may not /* FALLTHROUGH */. Not only will you return an incorrect result, but you'd corrupt the kernel malloc() arena if it weren't for a small but life-saving optimization in ioctl().
MFC after: 1 week
|
85297 |
21-Oct-2001 |
des |
Move procfs_* from procfs_machdep.c into sys_process.c, and rename them to proc_* in the process; procfs_machdep.c is no longer needed.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
|
85208 |
20-Oct-2001 |
jhb |
Assert that a ucred is unshared before we remap its ids.
|
85180 |
19-Oct-2001 |
des |
Argh! I updated the version number in the MODULE_DEPEND() thingamagook but not in the actual MODULE_VERSION(). Pass me the pointy hat.
|
85128 |
19-Oct-2001 |
des |
Switch to dynamic rather than static initialization. This makes it possible (in theory) for nodes to be added and / or removed from pseudofs filesystems at runtime.
|
84874 |
13-Oct-2001 |
bde |
Fixed bitrot in a banal comment by removing the comment.
|
84873 |
13-Oct-2001 |
bde |
Backed out vestiges of the quick fixes for the transient breakage of <sys/mount.h> in rev.1.106 of the latter (don't include <sys/socket.h> just to work around bugs in <sys/mount.h>).
|
84827 |
11-Oct-2001 |
jhb |
Change the kernel's ucred API as follows:
- crhold() returns a reference to the ucred whose refcount it bumps.
- crcopy() now simply copies the credentials from one credential to another and has no return value.
- a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
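The three primitives in this entry can be modelled in a few lines of userland C. This is a minimal sketch under stated assumptions, not the kernel code: struct cred_example and the cred_*() names are hypothetical, the structure carries only a reference count and one id, and there is no locking.

```c
#include <stdlib.h>

/* Hypothetical, simplified credential; the real struct ucred has more fields. */
struct cred_example {
	unsigned int	cr_ref;		/* reference count */
	unsigned int	cr_uid;		/* effective user id */
};

/* Allocate a zeroed credential holding a single reference. */
static struct cred_example *
cred_alloc(void)
{
	struct cred_example *cr = calloc(1, sizeof(*cr));

	if (cr != NULL)
		cr->cr_ref = 1;
	return (cr);
}

/* Like crhold(): bump the refcount and return the same credential. */
static struct cred_example *
cred_hold(struct cred_example *cr)
{
	cr->cr_ref++;
	return (cr);
}

/* Like crshared(): true if more than one reference exists. */
static int
cred_shared(const struct cred_example *cr)
{
	return (cr->cr_ref > 1);
}

/* Like crcopy(): copy the credential data only; dest keeps its own refcount. */
static void
cred_copy(struct cred_example *dest, const struct cred_example *src)
{
	dest->cr_uid = src->cr_uid;
}

/* Drop one reference and free the credential when the last one goes away. */
static void
cred_free(struct cred_example *cr)
{
	if (--cr->cr_ref == 0)
		free(cr);
}
```

Having the hold primitive return its argument lets a caller write "newref = cred_hold(old);" in one statement, and a shared test gives code a cheap way to decide whether a credential must be duplicated before it is modified.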
|
84811 |
11-Oct-2001 |
jhb |
Add missing includes of sys/lock.h.
|
84637 |
07-Oct-2001 |
des |
Dissociate ptrace from procfs.
Until now, the ptrace syscall was implemented as a wrapper that called various functions in procfs depending on which ptrace operation was requested. Most of these functions were themselves wrappers around procfs_{read,write}_{,db,fp}regs(), with only some extra error checks, which weren't necessary in the ptrace case anyway.
This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c (renaming it to proc_rwmem() in the process), and implements ptrace() directly in terms of procfs_{read,write}_{,db,fp}regs() instead of having it fake up a struct uio and then call procfs_do{,db,fp}regs().
It also moves the prototypes for procfs_{read,write}_{,db,fp}regs() and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files except procfs_machdep.c as "optional procfs" instead of "standard".
|
84634 |
07-Oct-2001 |
des |
Remove some useless preprocessor paranoia.
|
84633 |
07-Oct-2001 |
des |
In procfs_readdir(), when the directory being read was a process directory, the target process was being held locked during the uiomove() call. If the process calling readdir() was the same as the target process (for instance 'ls /proc/curproc/'), and uiomove() caused a page fault, the result would be a proc lock recursion. I have no idea how long this has been broken - possibly ever since pfind() was changed to lock the process it returns.
Also replace the one and only call to procfs_findtextvp() with a direct test of td->td_proc->p_textvp.
|
84386 |
02-Oct-2001 |
des |
Add a PFS_DISABLED flag; pfs_visible() automatically returns 0 if it is set on the node in question. Also add two API functions for setting and clearing this flag; setting it also reclaims all vnodes associated with the node.
|
84383 |
02-Oct-2001 |
des |
Only print "XXX (un)registered" message if bootverbose.
|
84247 |
01-Oct-2001 |
des |
[the previous commit to pseudofs_vncache.c got the wrong log message]
YA pseudofs megacommit, part 2:
- Merge the pfs_vnode and pfs_vdata structures, and make the vnode cache a doubly-linked list. This eliminates the need to walk the list in pfs_vncache_free().
- Add an exit callout which revokes vnodes associated with the process that just exited. Since it needs to lock the cache when it does this, pfs_vncache_mutex needs MTX_RECURSE.
|
84246 |
01-Oct-2001 |
des |
YA pseudofs megacommit, part 1:
- Add a third callback to the pfs_node structure. This one simply returns non-zero if the specified requesting process is allowed to access the specified node for the specified target process. This is used in addition to the usual permission checks, e.g. when certain files don't make sense for certain (system) processes.
- Make sure that pfs_lookup() and pfs_readdir() don't yap about files which aren't pfs_visible(). Also check pfs_visible() before performing reads and writes, to prevent the kind of races reported in SA-00:77 and SA-01:55 (fork a child, open /proc/child/ctl, have that child fork a setuid binary, and assume control of it).
- Add some more trace points.
|
84187 |
30-Sep-2001 |
des |
pseudofs.h:
- Rearrange the flag constants a little to simplify specifying and testing for readability and writeability.
pseudofs_vnops.c:
- Track the aforementioned change.
- Add checks to pfs_open() to prevent opening read-only files for writing or vice versa (pfs_{read,write} would block the actual reads and writes, but it's still a bug to allow the open() to succeed). Also, return EOPNOTSUPP if the caller attempts to lock the file.
- Add more trace points.
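The open-time checks described above can be sketched as a pure function. All names and flag values below are hypothetical stand-ins, and returning EPERM for a direction mismatch is an assumption; only EOPNOTSUPP for lock requests is stated in the log message.

```c
#include <errno.h>

/* Hypothetical node capability flags. */
#define EX_NODE_RD	0x1	/* node can be read */
#define EX_NODE_WR	0x2	/* node can be written */

/* Hypothetical open-mode bits. */
#define EX_OPEN_READ	0x1
#define EX_OPEN_WRITE	0x2
#define EX_OPEN_LOCK	0x4	/* caller asked for a lock at open time */

/*
 * Decide at open() time whether the requested mode is compatible with
 * the node, instead of letting the open succeed and failing later in
 * read or write.  Returns 0 or an errno value.
 */
static int
example_open_check(int node_flags, int open_mode)
{
	if (open_mode & EX_OPEN_LOCK)
		return (EOPNOTSUPP);
	if ((open_mode & EX_OPEN_READ) && !(node_flags & EX_NODE_RD))
		return (EPERM);		/* assumed error code */
	if ((open_mode & EX_OPEN_WRITE) && !(node_flags & EX_NODE_WR))
		return (EPERM);		/* assumed error code */
	return (0);
}
```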
|
84156 |
30-Sep-2001 |
phk |
The behaviour of whiteout'ing symlinks was too confusing; instead, remove them when asked to.
|
84098 |
29-Sep-2001 |
des |
Pseudofs take 2:
- Remove hardcoded uid, gid, mode from struct pfs_node; make pfs_getattr() smart enough to get it right most of the time, and allow for callbacks to handle the remaining cases. Rework the definition macros to match.
- Add lots of (conditional) debugging output.
- Fix a long-standing bug inherited from procfs: don't pretend to be a read-only file system. Instead, return EOPNOTSUPP for operations we truly can't support and allow others to fail silently. In particular, pfs_lookup() now treats CREATE as LOOKUP. This may need more work.
- In pfs_lookup(), if the parent node is process-dependent, check that the process in question still exists.
- Implement pfs_open() - its only current function is to check that the process opening the file can see the process it belongs to.
- Finish adding support for writeable nodes.
- Bump module version number.
- Introduce lots of new bugs.
|
84082 |
28-Sep-2001 |
des |
The previous commit introduced some references to "curproc" which should have been references to "curthread". Correct this.
|
83978 |
26-Sep-2001 |
rwatson |
o Modify generic specfs device open access control checks to use securelevel_ge() instead of direct securelevel variable checks.
Obtained from: TrustedBSD Project
|
83949 |
26-Sep-2001 |
fenner |
Fix (typo? pasteo?): panic("ffs_mountroot..." -> panic("ntfs_mountroot...")
|
83927 |
25-Sep-2001 |
des |
Clean up my source tree to avoid getting hit too badly by the next KSE or whatever mega-commit. This goes some way towards adding support for writeable files (needed by procfs).
|
83920 |
25-Sep-2001 |
mike |
A process name may contain whitespace and unprintable characters, so convert those characters to octal notation. Also convert backslashes to octal notation to avoid confusion.
Reviewed by: des MFC after: 1 week
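The escaping rule in this entry can be sketched as follows. This is an illustration rather than the committed code: the function name escape_name(), the three-digit \ooo form, and the truncation behaviour are assumptions.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Copy src into dst, replacing whitespace, unprintable bytes and
 * backslashes with a three-digit octal escape such as \040.
 * Returns the number of bytes written, not counting the terminating
 * NUL.  If dst is too small, output stops before the first character
 * that does not fit, so an escape is never cut in half.
 */
static size_t
escape_name(char *dst, size_t dstlen, const char *src)
{
	size_t used = 0;

	if (dstlen == 0)
		return (0);
	for (; *src != '\0'; src++) {
		unsigned char c = (unsigned char)*src;

		if (isspace(c) || !isprint(c) || c == '\\') {
			if (used + 4 >= dstlen)
				break;
			snprintf(dst + used, dstlen - used, "\\%03o",
			    (unsigned int)c);
			used += 4;
		} else {
			if (used + 1 >= dstlen)
				break;
			dst[used++] = (char)c;
		}
	}
	dst[used] = '\0';
	return (used);
}
```

Escaping the backslash itself is what keeps the output unambiguous: a reader can always tell an escape sequence from a literal backslash in the original name.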
|
83804 |
21-Sep-2001 |
jhb |
Use the passed in thread to selrecord() instead of curthread.
|
83635 |
18-Sep-2001 |
rwatson |
o Remove redundant securelevel/pid1 check in procfs_rw() -- this protection is enforced at the individual method layer using p_candebug().
Obtained from: TrustedBSD Project
|
83417 |
13-Sep-2001 |
julian |
fix typo pointed out by: jhb
|
83384 |
12-Sep-2001 |
jhb |
Restore these files to being portable:
- Use some simple #define's at the top of the files for proc -> thread changes instead of having lots of needless #ifdef's in the code.
- Don't try to use struct thread in !FreeBSD code.
- Don't use a few struct lwp's in some of the NetBSD code since it isn't in their HEAD.
The new diff relative to before KSE is now significantly smaller and easier to maintain.
|
83366 |
12-Sep-2001 |
julian |
KSE Milestone 2
Note: ALL MODULES MUST BE RECOMPILED
Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
83291 |
10-Sep-2001 |
kris |
Fix some signed/unsigned integer confusion, and add bounds checking of arguments to some functions.
Obtained from: NetBSD Reviewed by: peter MFC after: 2 weeks
|
83229 |
08-Sep-2001 |
semenu |
Stole unicode translation table from mount_msdos. Add kernel code to support this translation.
MFC after: 2 weeks
|
83227 |
08-Sep-2001 |
semenu |
Fix opening particular file's attributes (as described in man page). This is useful for debug purposes.
MFC after: 2 weeks
|
83226 |
08-Sep-2001 |
semenu |
Reference devvp on ntnode creation and dereference on removal. Previous code led to page faults because i_devvp went zero after VOP_RECLAIM, but ntnode was reused (not reclaimed).
MFC after: 2 weeks
|
83225 |
08-Sep-2001 |
semenu |
Fix errors and warnings when compiling with NTFS_DEBUG > 1
MFC after: 2 weeks
|
82517 |
29-Aug-2001 |
ache |
smbfs_advlock: simplify overflow checks (copy from kern_lockf.c)
minor formatting issues to minimize differences
|
82347 |
26-Aug-2001 |
ache |
Cosmetique & style fixes from bde
|
82270 |
24-Aug-2001 |
ache |
Copy from kern_lockf.c: remove extra check
|
82210 |
23-Aug-2001 |
ache |
Copy yet one check for SEEK_END overflow
|
82203 |
23-Aug-2001 |
ache |
Copy my newly introduced l_len<0 'oops' fix from kern_lockf.c
|
82201 |
23-Aug-2001 |
ache |
Copy POSIX l_len<0 handling from kern_lockf.c
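For readers unfamiliar with the rule being copied around in these entries: POSIX says that a negative l_len describes the bytes immediately before the starting offset. A minimal userland sketch of that range computation follows, with overflow checks in the same spirit as the neighbouring entries. The function name lock_range() and the use of -1 to mean "to end of file" are assumptions for illustration.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Resolve an already-absolute start offset and an l_len value into an
 * inclusive byte range [*first, *last].  *last == -1 means "to end of
 * file".  Returns 0 or an errno value.
 */
static int
lock_range(int64_t start, int64_t len, int64_t *first, int64_t *last)
{
	if (start < 0)
		return (EINVAL);
	if (len < 0) {
		/* Negative length: the range ends just before start. */
		if (start == 0)
			return (EINVAL);
		if (start + len < 0)
			return (EINVAL);
		*first = start + len;
		*last = start - 1;
	} else if (len == 0) {
		/* Zero length: lock from start to end of file. */
		*first = start;
		*last = -1;
	} else {
		/* Check for overflow before computing start + len - 1. */
		if (len - 1 > INT64_MAX - start)
			return (EOVERFLOW);
		*first = start;
		*last = start + len - 1;
	}
	return (0);
}
```

For example, a start of 100 with a length of -10 covers bytes 90 through 99, while a start of 5 with a length of -10 is rejected because the range would begin before offset 0.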
|
82196 |
23-Aug-2001 |
ache |
Cosmetique: correct English in comments
non-cosmetique: add missing break; - original code was broken here
|
82190 |
23-Aug-2001 |
ache |
Move <machine/*> after <sys/*>
Pointed by: bde
|
82175 |
23-Aug-2001 |
ache |
adv. lock: copy EOVERFLOW handling code from main variant
fix type of 'size' arg
|
82039 |
21-Aug-2001 |
bp |
Use proper endian conversion.
Obtained from: Mac OS X MFC after: 1 week
|
82038 |
21-Aug-2001 |
bp |
Return proper length of _PC_NAME_MAX value if long names support is enabled.
Obtained from: Mac OS X MFC after: 1 week
|
81620 |
14-Aug-2001 |
phk |
linux ls fails on DEVFS /dev because linux_getdents fails because linux_getdents uses VOP_READDIR( ..., &ncookies, &cookies ) instead of VOP_READDIR( ..., NULL, NULL ) because it seems to need the offsets for linux_dirent and sizeof(dirent) != sizeof(linux_dirent)...
PR: 29467 Submitted by: Michael Reifenberger <root@nihil.plaut.de> Reviewed by: phk
|
81112 |
03-Aug-2001 |
rwatson |
Remove dangling prototype for the now defunct procfs_kmemaccess() call.
Obtained from: TrustedBSD Project
|
81109 |
03-Aug-2001 |
rwatson |
Collapse a Pmem case in with the other debugging files case for procfs, as there are now no "unusual" protection properties to Pmem that differ from the other files. While I'm at it, introduce proc locking for the other files, which was previously present only in the Pmem case.
Obtained from: TrustedBSD Project
|
81108 |
03-Aug-2001 |
rwatson |
Remove read permission for group on the /proc/*/mem file, since kmem no longer requires access.
Reviewed by: tmm Obtained from: TrustedBSD Project
|
81107 |
03-Aug-2001 |
rwatson |
Prior to support for almost all ps activity via sysctl, ps used procfs, and so special-casing was introduced to provide extra procfs privilege to the kmem group. With the advent of non-setgid kmem ps, this code is no longer required, and in fact is potentially harmful as it allocates privilege to a gid that is increasingly less meaningful. Knowledge of specific gid's in kernel is also generally bad precedent, as the kernel security policy doesn't distinguish gid's specifically, only uid 0.
This commit removes reference to kmem in procfs, both in terms of access control decisions, and the applying of gid kmem to the /proc/*/mem file, simplifying the associated code considerably. Processes are still permitted to access the mem file based on the debugging policy, so ps -e still works fine for normal processes and use.
Reviewed by: tmm Obtained from: TrustedBSD Project
|
79996 |
19-Jul-2001 |
assar |
remove support for creating files and directories from msdosfs_mknod
|
79872 |
18-Jul-2001 |
jhb |
Grab the process lock around psignal().
Noticed by: tanimura
|
79335 |
05-Jul-2001 |
rwatson |
o Replace calls to p_can(..., P_CAN_xxx) with calls to p_canxxx(). The p_can(...) construct was a premature (and, it turns out, awkward) abstraction. The individual calls to p_canxxx() better reflect differences between the inter-process authorization checks, such as differing checks based on the type of signal. This has a side effect of improving code readability.
o Replace direct credential authorization checks in ktrace() with invocation of p_candebug(), while maintaining the special case check of KTR_ROOT. This allows ktrace() to "play more nicely" with new mandatory access control schemes, as well as making its authorization checks consistent with other "debugging class" checks.
o Eliminate "privused" construct for p_can*() calls which allowed the caller to determine if privilege was required for successful evaluation of the access control check. This primitive is currently unused, and as such, serves only to complicate the API.
Approved by: ({procfs,linprocfs} changes) des Obtained from: TrustedBSD Project
|
79245 |
04-Jul-2001 |
jhb |
- Update the vmmeter statistics for vnode pageins and pageouts in getpages/putpages.
- Use vm_page_undirty() instead of messing with pages' dirty fields directly.
|
79224 |
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
78907 |
28-Jun-2001 |
jhb |
Fix a mntvnode and vnode interlock reversal.
|
78906 |
28-Jun-2001 |
jhb |
Protect the mnt_vnode list with the mntvnode lock.
|
78274 |
15-Jun-2001 |
des |
#if 0 out pfs_null() to silence the warning about it not being referenced.
|
78244 |
15-Jun-2001 |
peter |
Fix warning: 568: warning: `portal_badop' defined but not used
|
78242 |
15-Jun-2001 |
peter |
Fix warning (exposed NetBSD code): 94: warning: `ntfs_bmap' declared `static' but never defined
|
78241 |
15-Jun-2001 |
peter |
Fix warnings (mostly harmless, due to struct bio being embedded in buf): 738: warning: passing arg 1 of `biodone' from incompatible pointer type 745: warning: passing arg 1 of `biodone' from incompatible pointer type
|
78240 |
15-Jun-2001 |
peter |
Fix warning: 552: warning: `fdesc_badop' defined but not used
|
78229 |
15-Jun-2001 |
peter |
Warning fix: coda_fbsd.c:113: warning: unused variable `ret'
|
78205 |
14-Jun-2001 |
bp |
Coda does not call vop_defaultop(), so add the necessary calls for VM objects.
Submitted by: Greg Troxel <gdt@ir.bbn.com> MFC after: 2 days
|
78179 |
13-Jun-2001 |
mjacob |
the last argument to copyinstr is of type size_t, not u_int
|
78161 |
13-Jun-2001 |
peter |
With this commit, I hereby pronounce gensetdefs past its use-by date.
Replace the a.out emulation of 'struct linker_set' with something a little more flexible. <sys/linker_set.h> now provides macros for accessing elements and completely hides the implementation.
The linker_set.h macros have been on the back burner in various forms since 1998 and have ideas and code from Mike Smith (SET_FOREACH()), John Polstra (ELF clue) and myself (cleaned up API and the conversion of the rest of the kernel to use it).
The macros declare a strongly typed set. They return elements with the type that you declare the set with, rather than a generic void *.
For ELF, we use the magic ld symbols (__start_<setname> and __stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the trick about how to force ld to provide them for kld's.
For a.out, we use the old linker_set struct.
NOTE: the item lists are no longer null terminated. This is why the code impact is high in certain areas.
The runtime linker has a new method to find the linker set boundaries depending on which backend format is in use.
linker sets are still module/kld unfriendly and should never be used for anything that may be modular one day.
Reviewed by: eivind
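The ELF mechanism this entry relies on can be shown outside the kernel. The sketch below is not the <sys/linker_set.h> implementation; it assumes GCC or Clang with an ELF linker that synthesizes __start_<name> and __stop_<name> symbols for a section whose name is a valid C identifier. The set name example_set, the EXAMPLE_SET_ADD() macro and the structure are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical element type; the set stores typed pointers to these. */
struct handler_example {
	const char	*name;
	int		 weight;
};

/* Place a pointer to sym in the "example_set" section. */
#define EXAMPLE_SET_ADD(sym)						\
	static const struct handler_example *const			\
	    example_set_ptr_##sym					\
	    __attribute__((section("example_set"), used)) = &(sym)

/* Section boundaries, provided by the linker rather than by any C file. */
extern const struct handler_example *const __start_example_set[];
extern const struct handler_example *const __stop_example_set[];

static const struct handler_example alpha = { "alpha", 1 };
static const struct handler_example beta = { "beta", 2 };
EXAMPLE_SET_ADD(alpha);
EXAMPLE_SET_ADD(beta);

/* Number of elements: the set is bounded, not NULL terminated. */
static size_t
example_set_count(void)
{
	return ((size_t)(__stop_example_set - __start_example_set));
}

/* Walk the set between the two boundary symbols and sum the weights. */
static int
example_set_total(void)
{
	const struct handler_example *const *p;
	int total = 0;

	for (p = __start_example_set; p < __stop_example_set; p++)
		total += (*p)->weight;
	return (total);
}
```

Because iteration runs from one boundary symbol to the other, nothing in the loop depends on a terminating NULL entry, which matches the NOTE in the log message about item lists no longer being null terminated.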
|
78073 |
11-Jun-2001 |
des |
For some reason, though the module builds just fine without <sys/lock.h>, LINT fails to build without it.
|
78018 |
10-Jun-2001 |
des |
Bail out if the fill function failed.
|
78017 |
10-Jun-2001 |
des |
Whoops, some of my test code snuck in here.
|
78003 |
10-Jun-2001 |
des |
Argh. Fix braino in previous commit.
|
78001 |
10-Jun-2001 |
des |
Add a 'flags' argument to the PFS_PROCDIR macro.
|
77998 |
10-Jun-2001 |
des |
Add support for process-dependent directories. This means that save for the lack of a man page, pseudofs is mostly complete now.
|
77967 |
10-Jun-2001 |
des |
Blah, not my day. This file needs <sys/mutex.h> now.
|
77966 |
10-Jun-2001 |
des |
Remember to unlock the process pfind() returns.
|
77965 |
10-Jun-2001 |
des |
Add missing #include of <sys/mutex.h>.
|
77964 |
10-Jun-2001 |
des |
Catch up with the change in sbuf_new's prototype.
|
77821 |
06-Jun-2001 |
jlemon |
The kq write filter was hooked up to the wrong socket, and thus was not behaving correctly. Fix by attaching to the correct socket.
Also call so{rw}wakeup in addition to the fifo wakeup, so that any kqfilters attached to the socket buffer get poked.
|
77799 |
06-Jun-2001 |
tanimura |
Lock VM Giant prior to locking a vm map.
Spotted by: Daniel Rock <D.Rock@t-online.de> Tested by: David Wolfskill <david@catwhisker.org>, Sean Eric Fagan <sef@kithrup.com>
|
77784 |
05-Jun-2001 |
shafeeq |
Now works again and as a module and with devfs. Used the bpf & tun drivers as examples as to what is necessary for devfs.
|
77589 |
01-Jun-2001 |
brian |
Support /dev/tun cloning. Ansify if_tun.c while I'm there.
Only tun0 -> tun32767 may now be opened as struct ifnet's if_unit is a short.
It's now possible to open /dev/tun and get a handle back for an available tun device (use devname to find out what you got).
The implementation uses rman by popular demand (and against my judgement) to track opened devices and uses the new dev_depends() to ensure that all make_dev()d devices go away before the module is unloaded.
Reviewed by: phk
|
77577 |
01-Jun-2001 |
ru |
- VFS_SET(msdos) -> VFS_SET(msdosfs)
- msdos.ko -> msdosfs.ko
- mount_msdos(8) -> mount_msdosfs(8)
- "msdos" -> "msdosfs" compatibility glue in mount(8)
|
77243 |
26-May-2001 |
phk |
Don't copy the trailing zero in readlink, it confuses namei().
PR: 27656
|
77223 |
26-May-2001 |
ru |
- sys/n[tw]fs moved to sys/fs/n[tw]fs
- /usr/include/n[tw]fs moved to /usr/include/fs/n[tw]fs
|
77215 |
26-May-2001 |
phk |
Create a general facility for making dev_t's depend on another dev_t. The dev_depends(dev_t, dev_t) function is for tying them to each other.
When destroy_dev() is called on a dev_t, all dev_t's depending on it will also be destroyed (depth first order).
Rewrite the make_dev_alias() to use this dependency facility.
kern/subr_disk.c: Make the disk mini-layer use dependencies to make sure all relevant dev_t's are removed when the disk disappears.
Make the disk mini-layer precreate some magic sub devices which the disk/slice/label code expects to be there.
kern/subr_disklabel.c: Remove some now unneeded variables.
kern/subr_diskmbr.c: Remove some ancient, commented out code.
kern/subr_diskslice.c: Minor cleanup. Use name from dev_t instead of dsname()
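The dependency idea in this entry (destroying a device also destroys everything that depends on it, depth first) can be modelled with a toy structure. This is a sketch under stated assumptions, not the kernel implementation: struct dev_example, the fixed-size dependent table and both function names are hypothetical.

```c
#include <stddef.h>

#define EX_MAXDEP	8

/* Hypothetical device record with a small table of dependents. */
struct dev_example {
	const char		*name;
	struct dev_example	*deps[EX_MAXDEP]; /* devices depending on us */
	int			 ndeps;
	int			 destroyed;
};

/* Record that child depends on parent.  Returns 0, or -1 if the table is full. */
static int
example_depends(struct dev_example *parent, struct dev_example *child)
{
	if (parent->ndeps >= EX_MAXDEP)
		return (-1);
	parent->deps[parent->ndeps++] = child;
	return (0);
}

/*
 * Destroy dev and, before it, every device that depends on it.
 * Returns the number of devices destroyed by this call.
 */
static int
example_destroy(struct dev_example *dev)
{
	int i, count = 0;

	if (dev->destroyed)
		return (0);
	for (i = 0; i < dev->ndeps; i++)
		count += example_destroy(dev->deps[i]);
	dev->ndeps = 0;
	dev->destroyed = 1;
	return (count + 1);
}
```

With a disk, a slice depending on the disk, and a partition depending on the slice, a single destroy call on the disk removes all three, which is the behaviour the disk mini-layer change in this entry relies on.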
|
77183 |
25-May-2001 |
rwatson |
o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing the original macro that pointed p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there.
o Restructure many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this:
    newcred = crdup(oldcred);
    ...
    p->p_ucred = newcred;
    crfree(oldcred);
  It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and allocation.
o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparison.
o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done.
o Update credential management calls, such as crfree(), to take into account new ruidinfo reference.
o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances.
o Update libkvm to take these changes into account.
Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit
|
77162 |
25-May-2001 |
ru |
- sys/msdosfs moved to sys/fs/msdosfs
- msdos.ko renamed to msdosfs.ko
- /usr/include/msdosfs moved to /usr/include/fs/msdosfs
|
77133 |
24-May-2001 |
ru |
Actually rename FDESC, PORTAL, UMAP and UNION file systems.
OK'ed by: bp
|
77131 |
24-May-2001 |
ru |
mount_umap(8) -> mount_umapfs(8).
|
77130 |
24-May-2001 |
ru |
mount_null(8) -> mount_nullfs(8).
|
77084 |
23-May-2001 |
jhb |
Don't acquire/release Giant around some of the places that need it in spec_getpages(). Instead, assert that Giant is held by the caller.
|
77050 |
23-May-2001 |
phk |
Change the way deletes are managed in DEVFS.
This fixes a number of warnings relating to removed cloned devices.
It also makes it possible to recreate deleted devices with mknod(2). The major/minor arguments are ignored.
|
77031 |
23-May-2001 |
ru |
- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs.
- Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs.
- Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.
- Install header files for the above file systems.
- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.
|
76945 |
21-May-2001 |
jhb |
Sort includes from previous commit.
|
76827 |
19-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
76797 |
18-May-2001 |
bp |
Currently there is no way to tell if a write operation invoked via vn_start_write() on the given vnode will be successful. VOP_LEASE() may help to solve this problem, but its return value is ignored nearly everywhere. For now just assume that the missing upper layer on write means insufficient access rights (which is correct for most cases).
|
76718 |
17-May-2001 |
bp |
VOP getwritemount() can be invoked on vnodes with VFREE flag set (used in snapshots code). At this point upper vp may not exist.
|
76716 |
17-May-2001 |
bp |
Use vop_*vobject() VOPs to get reference to VM object from upper or lower fs.
|
76715 |
17-May-2001 |
bp |
Do not leave an extra reference on vnode.
PR: kern/27250 Submitted by: "Vladimir B. Grebenschikov" <vova@express.ru> MFC after: 2 weeks
|
76688 |
16-May-2001 |
iedowse |
Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode.
All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can.
This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount.
Reviewed by: phk, bp
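The reference accounting described above can be sketched as a small userland model. This is only an illustration of the rule stated in the message, not the kernel's vflush(): the names demo_vnode and demo_vflush are hypothetical, and the real function walks every vnode on the mount rather than looking at a single counter.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a root vnode; only the use count matters here. */
struct demo_vnode {
    int usecount;
};

/*
 * Model of the rule in the commit message: "rootrefs" is the number of
 * references on the root vnode that are both expected and released.
 * Returns 0 on success or EBUSY if the filesystem is still in use.
 */
static int demo_vflush(struct demo_vnode *root, int rootrefs, int force)
{
    assert(rootrefs >= 0 && root->usecount >= rootrefs);

    /* References beyond the expected ones mean the filesystem is busy. */
    if (root->usecount > rootrefs && !force)
        return EBUSY;

    /* Not busy (or forced): drop the references the filesystem held itself. */
    root->usecount -= rootrefs;
    return 0;
}
```

In this model a forced flush succeeds even when extra root references exist, which is the case the message says the per-filesystem unmount code handled incorrectly.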
|
76571 |
14-May-2001 |
phk |
After a successful poll of the cloning functions, match on the returned dev_t rather than the original name.
This allows cloning from one name to another which is useful for /dev/tty and later for the pty's.
|
76554 |
13-May-2001 |
phk |
Convert DEVFS from an "opt-in" to an "opt-out" option.
If for some reason DEVFS is undesired, the "NODEVFS" option is needed now.
Pending any significant issues, DEVFS will be made mandatory in -current on July 1st so that we can start reaping the full benefits of having it.
|
76491 |
11-May-2001 |
jhb |
GC prototype for procfs_bmap() missed during a previous commit.
|
76320 |
06-May-2001 |
phk |
Remove unneeded devfs_badop()
Noticed by: rwatson
|
76236 |
03-May-2001 |
bp |
Convert vnode_pager_freepage() to vm_free_page().
Forgotten by: alfred
|
76167 |
01-May-2001 |
phk |
Implement vop_std{get|put}pages() and add them to the default vop[].
Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but the default.
|
76166 |
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
76160 |
30-Apr-2001 |
phk |
Uncut&paste some bogus use of VOP_BMAP in cd9660::VOP_STRATEGY. XXX mark some stuff which looks like further cut&paste junk.
|
76159 |
30-Apr-2001 |
phk |
Uncut&paste some bogus use of VOP_BMAP in hpfs::VOP_STRATEGY.
At the same time, eliminate uninitialized use of a vnode pointer. Interesting GCC didn't spot this.
|
76146 |
30-Apr-2001 |
bde |
Backed out previous commit. It caused massive filesystem corruption, not to mention a compile-time warning about the critical function becoming unused, by replacing spec_bmap() with vop_stdbmap().
ntfs seems to have the same bug.
The factor for converting specfs block numbers to physical block numbers is 1, but vop_stdbmap() uses the bogus factor btodb(ap->a_vp->v_mount->mnt_stat.f_iosize), which is 16 for ffs with the default block size of 8K. This factor is bogus even for vop_stdbmap() -- the correct factor is related to the filesystem blocksize which is not necessarily the same as the optimal i/o size. vop_stdbmap() was apparently cloned from nfs where these sizes happen to be the same.
There may also be a problem with a_vp->v_mount being null. spec_bmap() still checks for this, but I think the checks in specfs are dead code which used to support block devices.
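The factor of 16 quoted above follows from 512-byte device blocks: 8192 / 512 = 16. A minimal sketch of that arithmetic, assuming a 9-bit device block shift; demo_btodb and demo_bmap are hypothetical userland stand-ins, not the kernel macros:

```c
#include <assert.h>

/* Assumption: 512-byte device blocks, i.e. a shift of 9. */
#define DEMO_DEV_BSHIFT 9

/* Same shape as btodb(): convert a byte count to device blocks. */
static unsigned long demo_btodb(unsigned long bytes)
{
    return bytes >> DEMO_DEV_BSHIFT;
}

/* Scale a logical block number by a conversion factor. */
static unsigned long demo_bmap(unsigned long lbn, unsigned long factor)
{
    assert(factor > 0);
    return lbn * factor;
}
```

With a factor of 1, demo_bmap(100, 1) leaves block 100 where it is; with demo_btodb(8192) as the factor, the same request lands on block 1600, which illustrates how a wrong factor sends i/o to the wrong place.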
|
76131 |
29-Apr-2001 |
phk |
Add a vop_stdbmap(), and make it part of the default vop vector.
Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.
|
76117 |
29-Apr-2001 |
grog |
Revert consequences of changes to mount.h, part 2.
Requested by: bde
|
75934 |
25-Apr-2001 |
phk |
Move the netexport structure from the fs-specific mountstructure to struct mount.
This makes the "struct netexport *" parameter to the vfs_export and vfs_checkexport interface unneeded.
Consequently, all non-stacking filesystems can use vfs_stdcheckexp().
At the same time, make it a pointer to a struct netexport in struct mount, so that we can remove the bogus AF_MAX and #include <net/radix.h> from <sys/mount.h>
|
75893 |
24-Apr-2001 |
jhb |
Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning.
Reviewed by: -smp, dfr, jake
|
75877 |
23-Apr-2001 |
mjacob |
fix it so it compiles again
|
75874 |
23-Apr-2001 |
mjacob |
add this ridiculous include foo so it will compile again
|
75858 |
23-Apr-2001 |
grog |
Correct #includes to work with fixed sys/mount.h.
|
75856 |
23-Apr-2001 |
grog |
Correct #includes to work with fixed sys/mount.h.
|
75692 |
19-Apr-2001 |
alfred |
vnode_pager_freepage() is really vm_page_free() in disguise, nuke vnode_pager_freepage() and replace all calls to it with vm_page_free()
|
75580 |
17-Apr-2001 |
phk |
This patch removes the VOP_BWRITE() vector.
VOP_BWRITE() was a hack which made it possible for NFS client side to use struct buf with non-bio backing.
This patch takes a more general approach and adds a bp->b_op vector where more methods can be added.
The success of this patch depends on bp->b_op being initialized in all relevant places for some value of "relevant" which is not easy to determine. For now the buffers have grown a b_magic element which will make such issues a tiny bit easier to debug.
|
75478 |
13-Apr-2001 |
bp |
Move VT_SMBFS definition to the proper place. Undefine VI_LOCK/VI_UNLOCK.
|
75374 |
10-Apr-2001 |
bp |
Import kernel part of SMB/CIFS requester. Add smbfs(CIFS) filesystem.
Userland part will be in the ports tree for a while.
Obtained from: smbfs-1.3.7-dev package.
|
75295 |
07-Apr-2001 |
des |
Let pseudofs into the warmth of the FreeBSD CVS repo.
It's not finished yet (I still have to find a way to implement process-dependent nodes without consuming too much memory, and the permission system needs tightening up), but it's becoming hard to work on without a repo (I've accidentally almost nuked it once already), and it works (except for the lack of process-dependent nodes, that is).
I was supposed to commit this a week ago, but timed out waiting for jkh to reply to some questions I had. Pass him a spoonful of bad karma :)
|
74996 |
29-Mar-2001 |
jhb |
- Various style fixes.
- Fix a silly bug so that we return the actual error code if a procfs attach fails rather than always returning 0.
Reported by: bde
|
74927 |
28-Mar-2001 |
jhb |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
74914 |
28-Mar-2001 |
jhb |
Catch up to header include changes:
- <sys/mutex.h> now requires <sys/systm.h>
- <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>
|
74810 |
26-Mar-2001 |
phk |
Send the remains (such as I have located) of "block major numbers" to the bit-bucket.
|
74637 |
22-Mar-2001 |
bp |
Add dependency on libmchain module.
Spotted by: Andrzej Tobola <san@iem.pw.edu.pl>
|
74273 |
15-Mar-2001 |
rwatson |
o Change the API and ABI of the Extended Attribute kernel interfaces to introduce a new argument, "namespace", rather than relying on a first-character namespace indicator. This is in line with more recent thinking on EA interfaces on various mailing lists, including the posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces are defined by default, EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER, where the primary distinction lies in the access control model: user EAs are accessible based on the normal MAC and DAC file/directory protections, and system attributes are limited to kernel-originated or appropriately privileged userland requests.
o These API changes occur at several levels: the namespace argument is introduced in the extattr_{get,set}_file() system call interfaces, at the vnode operation level in the vop_{get,set}extattr() interfaces, and in the UFS extended attribute implementation. Changes are also introduced in the VFS extattrctl() interface (system call, VFS, and UFS implementation), where the arguments are modified to include a namespace field, as well as modified to avoid direct access to userspace variables from below the VFS layer (in the style of recent changes to mount by adrian@FreeBSD.org). This required some cleanup and bug fixing regarding VFS locks and the VFS interface, as a vnode pointer may now be optionally submitted to the VFS_EXTATTRCTL() call. Updated documentation for the VFS interface will be committed shortly.
o In the near future, the auto-starting feature will be updated to search two sub-directories to the ".attribute" directory in appropriate file systems: "user" and "system" to locate attributes intended for those namespaces, as the single filename is no longer sufficient to indicate what namespace the attribute is intended for. Until this is committed, all attributes auto-started by UFS will be placed in the EXTATTR_NAMESPACE_SYSTEM namespace.
o The default POSIX.1e attribute names for ACLs and Capabilities have been updated to no longer include the '$' in their filename. As such, if you're using these features, you'll need to rename the attribute backing files to the same names without '$' symbols in front.
o Note that these changes will require changes in userland, which will be committed shortly. These include modifications to the extended attribute utilities, as well as to libutil for new namespace string conversion routines. Once the matching userland changes are committed, a buildworld is recommended to update all the necessary include files and verify that the kernel and userland environments are in sync. Note: If you do not use extended attributes (most people won't), upgrading is not imperative although since the system call API has changed, the new userland extended attribute code will no longer compile with old include files.
o Couple of minor cleanups while I'm there: make more code compilation conditional on FFS_EXTATTR, which should recover a bit of space on kernels running without EA's, as well as update copyright dates.
Obtained from: TrustedBSD Project
|
74105 |
11-Mar-2001 |
sobomax |
Add missed MODULE_VERSION() call, so loading of unicode conversion routine works properly.
Clue beaten in by: des
|
74099 |
11-Mar-2001 |
bp |
Do not kill vnodes after rename. This can cause deadlocks in the deadfs.
Noticed by: Matthew N. Dodd <winter@jurai.net>
|
74096 |
11-Mar-2001 |
bp |
Add a mount time option which slightly relaxes checks for valid Joliet extensions.
PR: kern/23315 Reviewed by: adrian
|
74064 |
10-Mar-2001 |
bp |
Slightly reorganize allocation of new vnode. Use bit NVOLUME to detect vnodes which represent volumes (before it was done via strcmp()). Turn n_refparent into bit in the n_flag field.
|
74062 |
10-Mar-2001 |
bp |
Synch with changes in the NCP requester.
|
73942 |
07-Mar-2001 |
mckusick |
Fixes to track snapshot copy-on-write checking in the specinfo structure rather than assuming that the device vnode would reside in the FFS filesystem (which is obviously a broken assumption with the device filesystem).
|
73929 |
07-Mar-2001 |
jhb |
Grab the process lock while calling psignal and before calling psignal.
|
73920 |
07-Mar-2001 |
jhb |
Proc locking identical to that of linprocfs' vnops except that we hold the proc lock while calling psignal.
|
73919 |
07-Mar-2001 |
jhb |
Protect read to p_pptr with proc lock rather than proctree lock.
|
73918 |
07-Mar-2001 |
jhb |
Proc locking. Lock around psignal() and also ensure both an exclusive proctree lock and the process lock are held when updating p_pptr and p_oppid. When we are just reading p_pptr we only need the proc lock and not a proctree lock as well.
|
73906 |
07-Mar-2001 |
jhb |
Protect p_flag with the proc lock.
|
73871 |
06-Mar-2001 |
bp |
A name of the file can change while its id stays the same. So, we have to update it as well.
Remove unused function.
|
73383 |
03-Mar-2001 |
dfr |
Remove the copyinstr call which was trying to copy the pathname in from user space. It has already been copied in and mp->mnt_stat.f_mntonname has already been initialised by the caller.
This fixes a panic on the alpha caused by the fact that the variable 'size' wasn't initialised because the call to copyinstr() bailed out with an EFAULT error.
|
73286 |
01-Mar-2001 |
adrian |
Reviewed by: jlemon
An initial tidyup of the mount() syscall and VFS mount code.
This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work.
* the guts of the mount work has been moved into vfs_mount().
* move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace.
* Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long.
* rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.)
* remove the copyin*() stuff for `path'. `data' still requires copyin*() since it's a pointer into userland.
* set `mount->mnt_stat.f_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to.
* NOTE: f_mntonname is initialised with "/" in the case of a root mount.
|
72933 |
23-Feb-2001 |
alfred |
Display the Joliet Extension 'level' in the log message.
PR: kern/24998
|
72786 |
21-Feb-2001 |
rwatson |
o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison).
o Allow jail inheritance to be a function of credential inheritance.
o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use.
Notes:
o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure.
Reviewed by: freebsd-arch Obtained from: TrustedBSD Project
|
72637 |
18-Feb-2001 |
phk |
Remove a debug printf.
|
72521 |
15-Feb-2001 |
jlemon |
Extend kqueue down to the device layer.
Backwards compatible approach suggested by: peter
|
72435 |
13-Feb-2001 |
sobomax |
Add a hook for loading of a Unicode -> char conversion routine as a kld at run-time. This is a temporary solution until proper kernel Unicode interfaces are in place and as such was purposely designed to be as tiny as possible (3 lines of the code not counting comments). The port with conversion routines for the most popular single-byte languages will be added later today.
Reviewed by: bp, "Michael C . Wu" <keichii@iteration.net> Approved by: bp
|
72200 |
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarly, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
72091 |
06-Feb-2001 |
asmodai |
Fix typo: seperate -> separate.
Seperate does not exist in the english language.
|
72012 |
04-Feb-2001 |
phk |
Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with: sed(1) Reviewed by: md5(1)
|
71999 |
04-Feb-2001 |
phk |
Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details.
Created with: sed(1) Reviewed by: md5(1)
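The kind of rewrite this mechanical change performs can be illustrated with a small userland sketch, assuming a <sys/queue.h> that provides TAILQ_FOREACH. The struct and function names (demo_item, demo_sum_fields, demo_sum_macros) are hypothetical and are not taken from the changed files.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct demo_item {
    int value;
    TAILQ_ENTRY(demo_item) link;
};
TAILQ_HEAD(demo_head, demo_item);

/* Before: walk the list by touching the tqh_first/tqe_next fields directly. */
static int demo_sum_fields(struct demo_head *head)
{
    struct demo_item *it;
    int sum = 0;

    assert(head != NULL);
    for (it = head->tqh_first; it != NULL; it = it->link.tqe_next)
        sum += it->value;
    return sum;
}

/* After: the same walk expressed through the macro API. */
static int demo_sum_macros(struct demo_head *head)
{
    struct demo_item *it;
    int sum = 0;

    assert(head != NULL);
    TAILQ_FOREACH(it, head, link)
        sum += it->value;
    return sum;
}
```

Both functions visit the same elements in the same order; the macro form simply keeps callers independent of how the list head and entries are laid out.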
|
71998 |
04-Feb-2001 |
phk |
Use <sys/queue.h> macro API.
|
71993 |
04-Feb-2001 |
phk |
Remove a DIAGNOSTIC check which belongs in <sys/queue.h> if anyplace at all.
|
71945 |
02-Feb-2001 |
phk |
At the point in time where most devices are created, we don't know what time it is because boottime is not yet initialized. Finagle the relevant fields when we get the chance.
|
71936 |
02-Feb-2001 |
phk |
Only superuser can create symlinks. Give symlinks mode 755 by default to avoid triggering alert eyes. (the mode isn't used on symlinks)
|
71858 |
31-Jan-2001 |
peter |
Zap last remaining references to (and a use of) simple_locks.
|
71829 |
30-Jan-2001 |
phk |
Add a BUF_KERNPROC() in the BIO_DELETE path.
This seems to fix the problem which md(4) backed filesystems exposed.
|
71822 |
30-Jan-2001 |
phk |
Fix two minor nits.
Existences revealed, but no details offered by: bp
|
71777 |
29-Jan-2001 |
dillon |
This patch reestablishes the spec_fsync() guarantee that synchronous fsyncs, which typically occur during unmounting, will drain all dirty buffers even if it takes multiple passes to do so. The guarantee was mangled by the last patch which solved a problem due to -current disabling interrupts while holding giant (which caused an infinite spin loop waiting for I/O to complete). -stable does not have either patch, but has a similar bug in the original spec_fsync() code which is triggered by a bug in the softupdates umount code, a fix for which will be committed to -current as soon as Kirk stamps it. Then both solutions will be MFC'd to -stable.
-stable currently suffers from a combination of the softupdates bug and a small window of opportunity in the original spec_fsync() code, and -stable also suffers from the spin-loop bug but since interrupts are enabled the spin resolves itself in a few milliseconds.
|
71699 |
27-Jan-2001 |
jhb |
Back out proc locking to protect p_ucred for obtaining additional references along with the actual obtaining of additional references.
|
71576 |
24-Jan-2001 |
jasone |
Convert all simplelocks to mutexes and remove the simplelock implementations.
|
71569 |
24-Jan-2001 |
jhb |
- Catch up to proc flag changes.
|
71509 |
24-Jan-2001 |
jhb |
The lock being destroyed was misnamed, not unused. Add the lockdestroy() back in but with the proper name so that this compiles.
Submitted by: jasone
|
71496 |
24-Jan-2001 |
jhb |
Proc locking to protect p_ucred while we obtain additional references.
|
71482 |
23-Jan-2001 |
jhb |
- Remove unused header include.
- Use queue macros.
|
71481 |
23-Jan-2001 |
jhb |
Proc locking to protect p_ucred while we obtain an additional reference.
|
71480 |
23-Jan-2001 |
jhb |
- FreeBSD doesn't have an abortop vnop as far as I can tell, so #ifdef references to the hpf op out.
- Remove a lockdestroy() on a non-existent variable.
|
71138 |
17-Jan-2001 |
peter |
Fix breakage uncovered by LINT - don't refer to undefined variables in KASSERT()
|
70833 |
09-Jan-2001 |
wollman |
Delete unused #include <sys/select.h>.
|
70829 |
09-Jan-2001 |
wollman |
Don't compile a dead variable declaration.
|
70536 |
31-Dec-2000 |
phk |
Use macro API to <sys/queue.h>
|
70528 |
30-Dec-2000 |
dillon |
Fix a lockup problem that occurs with 'cvs update'. specfs's fsync can get into the same sort of infinite loop that ffs's fsync used to get into, probably due to background bitmap writes. The solution is the same.
|
70374 |
26-Dec-2000 |
dillon |
This implements a better launder limiting solution. There was a solution in 4.2-REL which I ripped out in -stable and -current when implementing the low-memory handling solution. However, maxlaunder turns out to be the saving grace in certain very heavily loaded systems (e.g. newsreader box). The new algorithm limits the number of pages laundered in the first pageout daemon pass. If that is not sufficient then successive passes will be run without any limit.
Write I/O is now pipelined using two sysctls, vfs.lorunningspace and vfs.hirunningspace. This prevents excessive buffered writes in the disk queues which cause long (multi-second) delays for reads. It leads to more stable (less jerky) and generally faster I/O streaming to disk by allowing required read ops (e.g. for indirect blocks and such) to occur without interrupting the write stream, among other things.
NOTE: eventually, filesystem write I/O pipelining needs to be done on a per-device basis. At the moment it is globalized.
|
70317 |
23-Dec-2000 |
jake |
Protect proc.p_pptr and proc.p_children/p_sibling with the proctree_lock.
linprocfs not locked pending response from informal maintainer.
Reviewed by: jhb, -smp@
|
70038 |
15-Dec-2000 |
jhb |
When p_ucred is passed to the venus daemon, first grab the proc lock to protect the p_ucred pointer, obtain a separate reference to the ucred, release the lock, and then pass in the new ucred reference.
|
69958 |
13-Dec-2000 |
rwatson |
o Tighten restrictions on use of /proc/pid/ctl and move access checks in ctl to using centralized p_can() inter-process access control interface.
Reviewed by: sef
|
69947 |
13-Dec-2000 |
jake |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags passed to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
|
69798 |
09-Dec-2000 |
des |
Add a module version (so that linprocfs can properly depend on procfs)
|
69781 |
08-Dec-2000 |
dwmalone |
Convert more malloc+bzero to malloc+M_ZERO.
Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>
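The malloc+bzero to malloc+M_ZERO conversion amounts to moving the zeroing into the allocator call. A userland sketch of the before and after shapes follows; demo_malloc, DEMO_M_ZERO and demo_softc are hypothetical stand-ins, since the kernel's malloc(9) takes a malloc type and wait flags that are omitted here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_M_ZERO 0x1 /* hypothetical flag: return zero-filled memory */

/* Hypothetical allocator wrapper that honours a zeroing flag. */
static void *demo_malloc(size_t size, int flags)
{
    void *p;

    assert(size > 0);
    p = malloc(size);
    if (p != NULL && (flags & DEMO_M_ZERO) != 0)
        memset(p, 0, size);
    return p;
}

struct demo_softc {
    int unit;
    char name[16];
};

/* Before: allocate, then zero in a separate step at every call site. */
static struct demo_softc *demo_alloc_old(void)
{
    struct demo_softc *sc = demo_malloc(sizeof(*sc), 0);

    if (sc != NULL)
        memset(sc, 0, sizeof(*sc)); /* stands in for bzero() */
    return sc;
}

/* After: ask the allocator for zeroed memory in one call. */
static struct demo_softc *demo_alloc_new(void)
{
    return demo_malloc(sizeof(struct demo_softc), DEMO_M_ZERO);
}
```

Both helpers hand back an all-zero structure; the second form removes one call per site and cannot forget the zeroing step.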
|
69767 |
08-Dec-2000 |
phk |
staticize.
|
69652 |
06-Dec-2000 |
jhb |
Protect accesses to member of struct proc with the proc lock.
|
69507 |
02-Dec-2000 |
jhb |
Protect p_stat with the sched_lock.
Reviewed by: jake
|
69149 |
25-Nov-2000 |
jlemon |
Update to reflect the disappearance of getsock().
Found by: LINT
|
68870 |
18-Nov-2000 |
bp |
Use vop_defaultop() instead of ntfs_bypass().
PR: kern/22756
|
68708 |
14-Nov-2000 |
mckusick |
Missed conversion of CIRCLEQ => TAILQ for mount list.
|
68505 |
08-Nov-2000 |
eivind |
More paranoia against overflows
|
68295 |
04-Nov-2000 |
bp |
v_interlock is a mutex now, not simple lock.
|
68259 |
02-Nov-2000 |
phk |
Take VBLK devices further out of their misery.
This should fix the panic I introduced in my previous commit on this topic.
|
68199 |
01-Nov-2000 |
eivind |
Fix overflow from jail hostname.
Bug found by: Esa Etelavuori <eetelavu@cc.hut.fi>
|
68186 |
01-Nov-2000 |
eivind |
Give vop_mmap an untimely death. The opportunity to give it a timely death timed out in 1996.
|
67895 |
29-Oct-2000 |
dwmalone |
Make malloc use M_ZERO in some more locations. Don't check for a null pointer if malloc called with M_WAITOK.
Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net> Approved by: bp
|
67893 |
29-Oct-2000 |
phk |
Move suser() and suser_xxx() prototypes and a related #define from <sys/proc.h> to <sys/systm.h>.
Correctly document the #includes needed in the manpage.
Add one now needed #include of <sys/systm.h>. Remove the consequent 48 unused #includes of <sys/proc.h>.
|
67885 |
29-Oct-2000 |
phk |
Weaken a bogus dependency on <sys/proc.h> in <sys/buf.h> by #ifdef'ing the offending inline function (BUF_KERNPROC) on it being #included already.
I'm not sure BUF_KERNPROC() is even the right thing to do or in the right place or implemented the right way (inline vs normal function).
Remove consequently unneeded #includes of <sys/proc.h>
|
67882 |
29-Oct-2000 |
phk |
Remove unneeded #include <sys/proc.h> lines.
|
67441 |
22-Oct-2000 |
bp |
Rev 1.41 was committed from wrong diff, now do it right.
|
67439 |
22-Oct-2000 |
bp |
Release and unlock vnode if resource deadlock detected.
|
67438 |
22-Oct-2000 |
bp |
Update stale comment.
PR: kern/21805
|
67437 |
22-Oct-2000 |
bp |
Remove de_lock field from denode structure and make msdosfs PDIRUNLOCK aware.
|
67145 |
15-Oct-2000 |
bp |
Fix nullfs breakage caused by incomplete migration of v_interlock from simple_lock to mutex.
Reset LK_INTERLOCK flag when interlock released manually.
|
66894 |
09-Oct-2000 |
chris |
o Move from Alfred Perlstein's "exclusion" technique of handling special file types to requiring all file types to properly implement fo_stat. This makes any new file type additions much easier as this code no longer has to be modified to accommodate it.
o Instead of using curproc in fdesc_allocvp, pass a `struct proc' pointer as a new fifth parameter.
|
66886 |
09-Oct-2000 |
eivind |
Blow away the v_specmountpoint define, replacing it with what it was defined as (rdev->si_mountpoint)
|
66877 |
09-Oct-2000 |
phk |
Don't hold an extra reference to vnodes. Devfs vnodes are sufficiently cheap to setup that it doesn't really matter that we recycle device vnodes at kleenex speed.
Implement first cut try at killing cloned devices when they are not needed anymore. For now only the bpf driver is involved in this experiment. Cloned devices can set the SI_CHEAPCLONE flag which allows us to destroy_dev() it when the vcount() drops to zero and the vnode is reclaimed. For now it's a requirement that the driver doesn't keep persistent state from close to (re)open.
Some whitespace changes.
|
66701 |
05-Oct-2000 |
alfred |
return correct type for process directory entries, DT_DIR not DT_REG
|
66673 |
05-Oct-2000 |
bde |
Forward-declare struct mbuf so that this file is less self-insufficient -- don't depend on garbage in <sys/mount.h>. mbufs aren't actually used here either. They should have been completely removed from filesystem interfaces when they were removed from the interfaces to convert between file handles and vnodes.
|
66615 |
04-Oct-2000 |
jasone |
Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
|
66571 |
03-Oct-2000 |
bp |
Make cd9660 filesystem PDIRUNLOCK aware. Now it can be used in vnode stacks and nullfs mounts.
Remove now unnecessary i_lock field from the iso_node structure.
|
66570 |
03-Oct-2000 |
bp |
Prevent dereference of NULL pointer when null_lock() and null_unlock() called and there is no underlying vnode.
|
66540 |
02-Oct-2000 |
bp |
Protect hash data with lock manager instead of home grown one.
Replace shared lock on vnode with exclusive one. It shouldn't impact performance as NCP protocol doesn't support outstanding requests.
Do not hold simple lock on vnode for long period of time.
Add functionality to the nwfs_print() routine.
|
66539 |
02-Oct-2000 |
bp |
Get rid of the legacy __P() macro. Remove 'register' keywords.
|
66524 |
02-Oct-2000 |
peter |
PDIRUNLOCK now exists on FreeBSD. Remove the (now incorrect) redefinition.
|
66356 |
25-Sep-2000 |
bp |
Fix vnode locking bugs in the nullfs. Add correct support for v_object management, so mmap() operation should work properly. Add support for extattrctl() routine (submitted by semenu).
At this point nullfs can be considered as functional and much more stable. In fact, it should behave as a "hard" "symlink" to underlying filesystem.
Reviewed in general by: mckusick, dillon Parts of logic obtained from: NetBSD
|
66028 |
18-Sep-2000 |
phk |
Ignore attempts to set flags to zero. This quenches a syslog warning from login(1).
|
65920 |
16-Sep-2000 |
phk |
Add canonical checks to devfs_setattr().
|
65788 |
12-Sep-2000 |
jhb |
Use size_t instead of u_int for 4th argument to copyinstr().
|
65557 |
07-Sep-2000 |
jasone |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|
65515 |
06-Sep-2000 |
phk |
Add refcounts to the "global" DEVFS inode slots, this allows us to recycle inodes after a destroy_dev() but not until all mounts have picked up the change.
Add support for an overflow table for DEVFS inodes. The static table defaults to 1024 inodes, if that fills, an overflow table of 32k inodes is allocated. Both numbers can be changed at compile time, the size of the overflow table also with the sysctl vfs.devfs.noverflow.
Use atomic instructions to barrier between make_dev()/destroy_dev() and the mounts.
Add lockmgr() locking of directories for operations accessing or modifying the directory TAILQs.
Various nitpicking here and there.
|
65467 |
05-Sep-2000 |
bp |
Various cleanups towards making nullfs functional (it is still broken at this point):
Replace all '#ifdef DEBUG' with '#ifdef NULLFS_DEBUG' and add NULLFSDEBUG macro.
Protect nullfs hash table with lockmgr.
Use proper order of operations when freeing mnt_data.
Return correct fsid in the null_getattr().
Add null_open() function to catch MNT_NODEV (obtained from NetBSD).
Add null_rename() to catch cross-fs rename operations (submitted by Ustimenko Semen <semen@iclub.nsu.ru>)
Remove duplicate $FreeBSD$ tags.
|
65464 |
05-Sep-2000 |
bp |
Get rid of the __P() macros.
Encouraged by: peter
|
65447 |
04-Sep-2000 |
phk |
Off by one error.
Submitted by: des
|
65445 |
04-Sep-2000 |
des |
Remove a comment that has been not only obsolete but patently wrong for the last 31 revisions (almost three years).
|
65374 |
02-Sep-2000 |
phk |
Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support.
If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present".
This happily removes an ugly hack from kern/vfs_conf.c.
This forces a rename of the eventhandler and the standard clone helper function.
Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h>
Remove all #includes of opt_devfs.h they no longer matter.
|
65339 |
01-Sep-2000 |
rwatson |
o Simplify if/then clause equating ESRCH with ENOENT when hiding a process
Submitted by: des
|
65331 |
01-Sep-2000 |
rwatson |
o Make procfs use vaccess() for procfs_access() DAC and super-user checks, rather than implementing its own {uid,gid,other} checks against vnode mode. Similar change to linprocfs currently under review.
Obtained from: TrustedBSD Project
|
65237 |
30-Aug-2000 |
rwatson |
o Centralize inter-process access control, introducing:
int p_can(p1, p2, operation, privused)
which allows specification of subject process, object process, inter-process operation, and an optional call-by-reference privused flag, allowing the caller to determine if privilege was required for the call to succeed. This allows jail, kern.ps_showallprocs and regular credential-based interaction checks to occur in one block of code. Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL, and P_CAN_DEBUG. p_can currently breaks out as a wrapper to a series of static function checks in kern_prot, which should not be invoked directly.
o Commented out capabilities entries are included for some checks.
o Update most inter-process authorization to make use of p_can() instead of manual checks, PRISON_CHECK(), P_TRESPASS(), and kern.ps_showallprocs.
o Modify suser{,_xxx} to use const arguments, as it no longer modifies process flags due to the disabling of ASU.
o Modify some checks/errors in procfs so that ENOENT is returned instead of ESRCH, further improving concealment of processes that should not be visible to other processes. Also introduce new access checks to improve hiding of processes for procfs_lookup(), procfs_getattr(), procfs_readdir(). Correct a bug reported by bp concerning not handling the CREATE case in procfs_lookup(). Remove volatile flag in procfs that caused apparently spurious qualifier warnings (approved by bde).
o Add comment noting that ktrace() has not been updated, as its access control checks are different from ptrace(), whereas they should probably be the same. Further discussion should happen on this topic.
Reviewed by: bde, green, phk, freebsd-security, others Approved by: bde Obtained from: TrustedBSD Project
|
65200 |
29-Aug-2000 |
rwatson |
o Restructure vaccess() so as to check for DAC permission to modify the object before falling back on privilege. Make vaccess() accept an additional optional argument, privused, to determine whether privilege was required for vaccess() to return 0. Add commented out capability checks for reference. Rename some variables to make it more clear which modes/uids/etc are associated with the object, and which with the access mode.
o Update file system use of vaccess() to pass NULL as the optional privused argument. Once additional patches are applied, suser() will no longer set ASU, so privused will permit passing of privilege information up the stack to the caller.
Reviewed by: bde, green, phk, -security, others Obtained from: TrustedBSD Project
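The ordering described above (discretionary permission bits first, privilege only as a fallback, with an optional out-parameter reporting whether privilege was needed) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: demo_vaccess(), demo_privilege_needed() and the DEMO_* constants are hypothetical names, a single group id stands in for the full credential, and the real vaccess() signature is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical access-mode bits, laid out like one rwx triplet. */
#define DEMO_READ  04u
#define DEMO_WRITE 02u
#define DEMO_EXEC  01u

/*
 * Sketch of "DAC first, privilege second".  Returns 0 on success or
 * EACCES on denial.  If privused is non-NULL it is set to 1 only when
 * the request succeeded because the caller is uid 0.
 */
static int
demo_vaccess(unsigned file_mode, unsigned file_uid, unsigned file_gid,
    unsigned acc_mode, unsigned cr_uid, unsigned cr_gid, int *privused)
{
	unsigned dac_granted;

	if (privused != NULL)
		*privused = 0;

	/* Pick the owner, group, or other permission triplet. */
	if (cr_uid == file_uid)
		dac_granted = (file_mode >> 6) & 07u;
	else if (cr_gid == file_gid)
		dac_granted = (file_mode >> 3) & 07u;
	else
		dac_granted = file_mode & 07u;

	/* The mode bits alone are enough: no privilege was used. */
	if ((acc_mode & dac_granted) == acc_mode)
		return (0);

	/* Fall back on privilege and report that it was required. */
	if (cr_uid == 0) {
		if (privused != NULL)
			*privused = 1;
		return (0);
	}
	return (EACCES);
}

/* Convenience wrapper: -1 if denied, otherwise the privused value. */
static int
demo_privilege_needed(unsigned file_mode, unsigned file_uid,
    unsigned file_gid, unsigned acc_mode, unsigned cr_uid, unsigned cr_gid)
{
	int privused;

	if (demo_vaccess(file_mode, file_uid, file_gid, acc_mode,
	    cr_uid, cr_gid, &privused) != 0)
		return (-1);
	return (privused);
}
```

Callers that do not care whether privilege was exercised pass NULL for privused, which matches the file system updates mentioned in the message.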
|
65132 |
27-Aug-2000 |
phk |
Reorder vop's alphabetically.
Smarter use of devfs_allocv() (from bp@).
Introduce devfs_find().
".." fixes to devfs_lookup (from bp@).
|
65118 |
26-Aug-2000 |
phk |
Minor cleanups to devfs_readdir(); Add devfs_read() for directories. (inspired by bp@)
|
65075 |
25-Aug-2000 |
bde |
Quick fix for msdosfs_write() on alphas and other machines with either longs larger than 32 bits or strict alignment requirements.
pm_fatmask had type u_long, but it must have a type that has precisely 32 bits and this type must be no smaller than int, so that ~pmp->pm_fatmask has no bits above the 31st set. Otherwise, comparisons between (cn | ~pmp->pm_fatmask) and magic 32-bit "cluster" numbers always fail. The correct fix is to use the C99 type uint_least32_t and mask with 0xffffffff. The quick fix is to use u_int32_t and assume that ints have
msdosfs metadata is riddled with unaligned fields, and on alphas, unaligned_fixup() apparently has problems fixing up the unaligned accesses caused by this. The quick fix is to not comment out the NetBSD code that sort of handles this, and define UNALIGNED_ACCESS on i386's so that the code doesn't change on i386's. The correct fix would define UNALIGNED_ACCESS in a central machine-dependent header and maybe add some extra cases to unaligned_fixup(). UNALIGNED_ACCESS is also tested in isofs.
Submitted by: parts by Mark Abene <phiber@radicalmedia.com> PR: 19086
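The mask-width problem described in this message can be reproduced in isolation. The sketch below is an assumption-laden illustration, not the msdosfs code: the DEMO_* values and both helper functions are hypothetical, an equality test stands in for whatever comparisons the driver actually performs, and int is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FAT16_MASK 0x0000ffffu	/* hypothetical FAT16 mask */
#define DEMO_CLUST_EOF  0xffffffffu	/* hypothetical 32-bit magic value */

/*
 * Mask held in a 64-bit type (as u_long is on alpha): ~fatmask also
 * sets bits 32..63, so the result can never equal a 32-bit constant.
 */
static int
eof_with_wide_mask(uint64_t cn, uint64_t fatmask)
{
	return ((cn | ~fatmask) == DEMO_CLUST_EOF);
}

/*
 * Mask held in an exactly-32-bit type (assuming 32-bit int): ~fatmask
 * has no bits above the 31st, and the comparison behaves as intended.
 */
static int
eof_with_32bit_mask(uint32_t cn, uint32_t fatmask)
{
	return ((cn | ~fatmask) == DEMO_CLUST_EOF);
}
```

With a FAT16 end-of-chain entry of 0xffff, only the 32-bit variant recognises the marker; the wide variant fails for every input.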
|
65051 |
24-Aug-2000 |
phk |
Fix panic when removing open device (found by bp@).
Implement subdirs.
Build the full "devicename" for cloning functions.
Fix panic when deleted device goes away.
Collapse devfs_dir and devfs_dirent structures.
Add proper cloning to the /dev/fd* "device-"driver.
Fix a bug in make_dev_alias() handling which made aliases appear multiple times.
Use devfs_clone to implement getdiskbyname().
Make specfs maintain the stat(2) timestamps per dev_t.
|
64895 |
21-Aug-2000 |
phk |
Fix devfs_access() bug on directories.
Remove unused #includes.
Bug spotted by: markm
|
64880 |
20-Aug-2000 |
phk |
Remove all traces of Julian's DEVFS (incl. from kern/subr_diskslice.c)
Remove old DEVFS support fields from dev_t.
Make uid, gid & mode members of dev_t and set them in make_dev().
Use correct uid, gid & mode in make_dev in disk minilayer.
Add support for registering alias names for a dev_t using the new function make_dev_alias(). These will show up as symlinks in DEVFS.
Use makedev() rather than make_dev() for MFS's magic devices to prevent DEVFS from noticing this abuse.
Add a field for DEVFS inode number in dev_t.
Add new DEVFS in fs/devfs.
Add devfs cloning to: disk minilayer (ie: ad(4), sd(4), cd(4) etc etc) md(4), tun(4), bpf(4), fd(4)
If DEVFS, add -d flag to /sbin/init's args to make it mount devfs.
Add commented out DEVFS to GENERIC
|
64865 |
20-Aug-2000 |
phk |
Centralize the canonical vop_access user/group/other check in vaccess().
Discussed with: bde
|
64819 |
18-Aug-2000 |
phk |
Introduce vop_stdinactive() and make it the default if no vop_inactive is declared.
Sort and prune a few vop_op[].
|
63962 |
28-Jul-2000 |
sheldonh |
Rename the loadable nullfs kernel module: null -> nullfs
|
63788 |
24-Jul-2000 |
mckusick |
This patch corrects the first round of panics and hangs reported with the new snapshot code.
Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem.
Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write().
Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations.
Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot.
Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation.
Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress.
Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occasionally read them, which can cause unexpected behavior.
Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic.
Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation.
Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.
|
63141 |
14-Jul-2000 |
dwmalone |
Certain error conditions cause msdosfs_rename() to decrement the vnode reference count on 'fdvp' more times than it should.
PR: 17347 Submitted by: Ian Dowse <iedowse@maths.tcd.ie> Approved by: bde
|
62976 |
11-Jul-2000 |
mckusick |
Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed.
Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot, so for brevity and timeliness they are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
|
62573 |
04-Jul-2000 |
phk |
Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by: bde
|
62472 |
03-Jul-2000 |
phk |
Pull the rug out from under block mode devices; they return ENXIO on open(2) now.
|
62454 |
03-Jul-2000 |
phk |
Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grok our sources:
-sysctl_vm_zone SYSCTL_HANDLER_ARGS
+sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
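The difference between the two spellings can be shown with a pair of stand-in macros. This is a sketch only: DEMO_OLD_HANDLER_ARGS, DEMO_HANDLER_ARGS and both functions are hypothetical, and the real argument list of SYSCTL_HANDLER_ARGS is not reproduced.

```c
#include <assert.h>

/* Old style: the macro carries its own parentheses. */
#define DEMO_OLD_HANDLER_ARGS (int arg1, int arg2)

/* New style: the macro expands to the bare argument list. */
#define DEMO_HANDLER_ARGS int arg1, int arg2

/* Without expanding the macro this does not look like a function. */
static int
demo_old_handler DEMO_OLD_HANDLER_ARGS
{
	return (arg1 + arg2);
}

/* Here the parentheses are visible in the source text itself. */
static int
demo_handler(DEMO_HANDLER_ARGS)
{
	return (arg1 + arg2);
}
```

Both forms compile to the same thing; the second simply keeps the parentheses in the source, so tools that match function definitions textually can recognise them.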
|
62228 |
29-Jun-2000 |
bp |
Fix memory leakage on module unload.
Spotted by: fixed INVARIANTS code
|
62227 |
29-Jun-2000 |
bp |
Fix memory leakage on module unload.
Spotted by: fixed INVARIANTS code
|
62219 |
28-Jun-2000 |
chris |
fdesc_getattr: Don't fake any file types, just set vap->va_type to IFTOVT(stb.st_mode). If something does not report its mode, vap->va_type is set to VNON accordingly.
|
62184 |
27-Jun-2000 |
alfred |
by changing the logic here we can support dynamic additions of new filetypes.
Reviewed by: green
|
62182 |
27-Jun-2000 |
alfred |
if there are leading zeros fail the lookup
Pointed out by: Alexander Viro <viro@math.psu.edu>
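A lookup that maps a decimal component name to a number would otherwise accept "01" and "1" as two names for the same entry. The log does not say which file was changed, so the following is a generic sketch; demo_parse_number_name() is a hypothetical helper, not the committed code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Parse a name of namelen bytes (not necessarily NUL-terminated) as a
 * non-negative decimal number.  Returns the number, or -1 if the name
 * is empty, has a leading zero, contains a non-digit, or overflows.
 */
static int
demo_parse_number_name(const char *name, size_t namelen)
{
	size_t i;
	int digit, value = 0;

	if (namelen == 0)
		return (-1);
	/* "0" alone is valid; "00" or "01" would alias another entry. */
	if (namelen > 1 && name[0] == '0')
		return (-1);
	for (i = 0; i < namelen; i++) {
		if (name[i] < '0' || name[i] > '9')
			return (-1);
		digit = name[i] - '0';
		/* Refuse values that would not fit in an int. */
		if (value > (INT_MAX - digit) / 10)
			return (-1);
		value = value * 10 + digit;
	}
	return (value);
}
```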
|
62048 |
25-Jun-2000 |
bp |
Remove obsolete comment.
Submitted by: Marius Bendiksen <mbendiks@eunet.no>
|
61884 |
20-Jun-2000 |
chris |
Rename the `VRXEC' macro used to clear read and exec bits to `FDRX' so as not to impede upon VFS namespace.
|
61724 |
16-Jun-2000 |
phk |
Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
|
61716 |
15-Jun-2000 |
chris |
Remove unused include <sys/socketvar.h>.
|
61712 |
15-Jun-2000 |
chris |
Replace vattr_null() with VATTR_NULL() and do not explicitly set vattr fields to VNOVAL afterwards.
|
61572 |
12-Jun-2000 |
jmb |
before this commit, specfs reported disk partitions using decimal major and minor numbers. "ls -l" reports disk partitions using decimal major numbers and hex minor numbers.
make specfs use decimal major numbers and hex minor numbers, just like "ls -l"
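As an illustration of the convention described here (decimal major, hexadecimal minor), a formatting helper might look like the sketch below. The exact field widths and prefix used by specfs or ls(1) are not stated in the log, so the "%u, 0x%x" format and the demo_devnum() helper are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a device number as "major, 0xminor".  Returns a pointer to a
 * static buffer, so the result is overwritten by the next call.
 */
static const char *
demo_devnum(unsigned int major, unsigned int minor)
{
	static char buf[32];

	snprintf(buf, sizeof(buf), "%u, 0x%x", major, minor);
	return (buf);
}
```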
|
61315 |
06-Jun-2000 |
chris |
Instead of completely disallowing VOP_SETATTR, just do it where there is an underlying vnode.
Suggested by: bde
|
61173 |
02-Jun-2000 |
chris |
Update the comment for fdesc_setattr to reflect that we no longer actually setattr() on underlying vnodes.
|
61172 |
02-Jun-2000 |
chris |
- Do not allow VOP_SETATTR to modify underlying vnodes at all. This caused problems when fetch(1) was passed `-o -'. The rationale of this change is that applications attempting to change underlying vnodes for /dev/fd nodes are improperly written and the use of this interface should not ever have been encouraged. Proper alternatives are fchmod, fchown and others.
PR: 18952
- Remove stale, unused fdescnode->fd_link structure member.
|
60938 |
26-May-2000 |
jake |
Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen.
Requested by: msmith and others
|
60833 |
23-May-2000 |
jake |
Change the way that the queue(3) structures are declared; don't assume that the type argument to *_HEAD and *_ENTRY is a struct.
Suggested by: phk Reviewed by: phk Approved by: mdodd
|
60406 |
11-May-2000 |
chris |
Adapt fdesc to be mounted on /dev/fd and remove fd, stdin, stdout and stderr nodes. More specific items of this patch:
o Removed support for symbolic links, and the need for fdesc_readlink().
o Put all the code from fdesc_attr() into fdesc_getattr() and removed fdesc_attr(). This also made it easier to properly give all nodes unique inode numbers.
o The removal of all non-fd nodes allowed the removal of the fdesc_read(), fdesc_write(), and fdesc_ioctl() nodes, since we no longer have nodes that get special handling.
o Correct the component name validity-checking in fdesc_lookup(). It previously detected the end of the string by checking for a terminating NUL, now it uses cnp->cn_namelen.
o Handle kqueue files as FIFOs. This is probably the closest file type to represent this type of file there is, and it is unfortunately not very representative of a kqueue. Creation time is not supported by kqueue, so ctime, mtime and atime are all set to the current time when getattr() was called.
o Also set st_[mca]time to the current time since there's no data in socket structures that can be used to fill this in (FIFOs).
o Simplify fdesc_readdir() since it only has to report the numbered fd nodes. Add `.' and `..' directory links as well.
o Remove read bits from directories as they tend to confuse programs like tar(1).
Reviewed by: phk Discussed with: bde (earlier on, not quite review)
|
60281 |
09-May-2000 |
phk |
Change the "bdev-whiner" to whine when open is attempted and extend the deadline a month.
|
60041 |
05-May-2000 |
phk |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>.
<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes.
Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data.
Still a few bogus uses of struct buf to track down.
Repocopy by: peter
|
59914 |
03-May-2000 |
phk |
Remove 42 unneeded #include <sys/ioccom.h>.
ioccom.h defines only implementation detail, and should therefore only be included from the #include which defines the ioctl tags, in other words: never include it from *.c
|
59874 |
01-May-2000 |
peter |
Add $FreeBSD$
|
59794 |
30-Apr-2000 |
phk |
Remove unneeded #include <vm/vm_zone.h>
Generated by: src/tools/tools/kerninclude
|
59760 |
29-Apr-2000 |
phk |
Remove unneeded #include <sys/kernel.h>
|
59755 |
29-Apr-2000 |
peter |
nwfs depends on ncp
|
59652 |
26-Apr-2000 |
green |
Move procfs_fullpath() to vfs_cache.c, with a rename to textvp_fullpath(). There's no excuse to have code in synthetic filestores that allows direct references to the textvp anymore.
Feature requested by: msmith Feature agreed to by: warner Move requested by: phk Move agreed to by: bde
|
59522 |
22-Apr-2000 |
green |
Quiet an unused variable warning by commenting out a variable declaration that goes with a commented out statement.
|
59482 |
22-Apr-2000 |
green |
There's no reason to make "file" 0500 rather than 0555.
|
59481 |
22-Apr-2000 |
green |
Welcome back our old friend from procfs, "file"!
|
59391 |
19-Apr-2000 |
phk |
Remove ~25 unneeded #include <sys/conf.h> Remove ~60 unneeded #include <sys/malloc.h>
|
59368 |
18-Apr-2000 |
phk |
Remove unneeded <sys/buf.h> includes.
Due to some interesting cpp tricks in lockmgr, the LINT kernel shrinks by 924 bytes.
|
59288 |
16-Apr-2000 |
jlemon |
Introduce kqueue() and kevent(), a kernel event notification facility.
|
59249 |
15-Apr-2000 |
phk |
Complete the bio/buf divorce for all code below devfs::strategy
Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case.
CCD not converted yet, casts to struct buf (still safe)
atapi-cd casts to struct buf to examine B_PHYS
|
59241 |
15-Apr-2000 |
rwatson |
Introduce extended attribute support for FFS, allowing arbitrary (name, value) pairs to be associated with inodes. This support is used for ACLs, MAC labels, and Capabilities in the TrustedBSD security extensions, which are currently under development.
In this implementation, attributes are backed to data vnodes in the style of the quota support in FFS. Support for FFS extended attributes may be enabled using the FFS_EXTATTR kernel option (disabled by default). Userland utilities and man pages will be committed in the next batch. VFS interfaces and man pages have been in the repo since 4.0-RELEASE and are unchanged.
o ufs/ufs/extattr.h: UFS-specific extattr defines
o ufs/ufs/ufs_extattr.c: bulk of support routines
o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes
o contrib/softupdates/ffs_softdep.c: extattr.h includes
o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR
o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h (This should not be the case, and will be fixed in a future commit)
Currently attributes are not supported in MFS. This will be fixed.
Reviewed by: adrian, bp, freebsd-fs, other unthanked souls Obtained from: TrustedBSD Project
|
59034 |
05-Apr-2000 |
bp |
Try to obtain the timezone offset from the environment of the mount program. This helps in cases where the CMOS clock is set to UTC time.
|
58934 |
02-Apr-2000 |
phk |
Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)
Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.
Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack.
Add bio_queue field for struct bio aware disksort.
Address a lot of stylistic issues brought up by bde.
|
58706 |
27-Mar-2000 |
dillon |
Commit the buffer cache cleanup patch to 4.x and 5.x. This patch fixes a fragmentation problem due to geteblk() reserving too much space for the buffer and imposes a larger granularity (16K) on KVA reservations for the buffer cache to avoid fragmentation issues. The buffer cache size calculations have been redone to simplify them (fewer defines, better comments, less chance of running out of KVA).
The geteblk() fix solves a performance problem that DG was able to reproduce.
This patch does not completely fix the KVA fragmentation problems, but it goes a long way.
Mostly Reviewed by: bde and others Approved by: jkh
|
58349 |
20-Mar-2000 |
phk |
Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
This patch is machine generated except for the ccd.c and buf.h parts.
|
58345 |
20-Mar-2000 |
phk |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes.
Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL.
Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch was machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
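Two points from this message lend themselves to a small sketch: a flag whose value is zero can never be detected with a bit test, and a command field restricted to exactly one set bit is easy to validate. The DEMO_* constants and helper functions below are hypothetical and do not reproduce the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical old-style flags: "write" is merely the absence of "read". */
#define DEMO_OLD_B_READ  0x00100000u
#define DEMO_OLD_B_WRITE 0x00000000u

/* Hypothetical new-style commands: one distinct bit each. */
#define DEMO_CMD_READ    0x01u
#define DEMO_CMD_WRITE   0x02u
#define DEMO_CMD_FREE    0x04u

/* The obvious mistake: masking with a zero-valued flag is always false. */
static int
demo_old_is_write_buggy(unsigned int flags)
{
	return ((flags & DEMO_OLD_B_WRITE) != 0);
}

/* A command is valid only if exactly one bit is set. */
static int
demo_iocmd_valid(unsigned int cmd)
{
	return (cmd != 0 && (cmd & (cmd - 1)) == 0);
}
```

The cmd & (cmd - 1) expression clears the lowest set bit, so it is zero only when at most one bit was set; the separate cmd != 0 test rules out the empty case.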
|
58132 |
16-Mar-2000 |
phk |
Eliminate the undocumented, experimental, non-delivering and highly dangerous MAX_PERF option.
|
56674 |
27-Jan-2000 |
nyan |
Supported non-512 bytes/sector format.
PR: misc/12992 Submitted by: chi@bd.mbn.or.jp (Chiharu Shibata) and Dmitrij Tejblum <tejblum@arc.hq.cti.ru> Reviewed by: Dmitrij Tejblum <tejblum@arc.hq.cti.ru>
|
56272 |
19-Jan-2000 |
rwatson |
Fix bde'isms in acl/extattr syscall interface, renaming syscalls to prettier (?) names, adding some const's around here, et al.
Reviewed by: bde
|
56034 |
15-Jan-2000 |
bp |
Check if module was compiled without SMP support and running on an SMP system.
|
56033 |
15-Jan-2000 |
bp |
Add VT_NWFS tag.
|
55991 |
14-Jan-2000 |
bde |
Forward declare some structs so that this header is more self-sufficient.
|
55989 |
14-Jan-2000 |
bde |
Use MALLOC_DECLARE when it is #defined, not when a (wrong) test of __FreeBSD_version succeeds.
|
55765 |
10-Jan-2000 |
phk |
remove check now done in vn_isdisk().
|
55756 |
10-Jan-2000 |
phk |
Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by: bde
|
55594 |
08-Jan-2000 |
bp |
Treat negative uio_offset value as eof (idea by: bde). Prevent overflows by casting uio_offset to uoff_t. Return correct error number if directory entry is broken.
Reviewed by: bde
|
55311 |
02-Jan-2000 |
phk |
Return ENXIO if there is no device.
|
55308 |
02-Jan-2000 |
bp |
Fix the mess with signed/unsigned longs and ints (inspired by bde). Fix potential bug with directory reading. Explicitly limit file size to 4GB (msdos can't handle larger files). Slightly reorganize msdosfs_read() to reduce number of 'if's.
|
55206 |
29-Dec-1999 |
peter |
Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSDs, who made this change quite some time ago. More commits to come.
|
55190 |
28-Dec-1999 |
bp |
Avoid writing garbage if uiomove fails.
|
55189 |
28-Dec-1999 |
bp |
Fix an overflow in the msdosfs_read() function which was exposed on files larger than 2GB.
PR: 15639 Submitted by: Tim Kientzle <kientzle@acm.org> Reviewed by: phk
|
55188 |
28-Dec-1999 |
bp |
It is possible that the number of sectors specified in the BPB will exceed the FAT capacity. This leads to a kernel panic, while other systems just limit the number of clusters.
PR: 4381, 15136 Reviewed by: phk
|
55153 |
27-Dec-1999 |
peter |
Fix typo "," vs ";"
PR: 15696 Submitted by: Takashi Okumura <taka@cs.pitt.edu>
|
54932 |
21-Dec-1999 |
chris |
Fix a typo that was doing something kind of silly, and that is initializing the creation time for files to the uninitialized value:
vap->va_ctime = vap->va_ctime;
Changed to what was intended, assigning it to the modification time (thus making all three values of access time, modification time and creation time the same thing).
Reviewed by: grog
|
54908 |
20-Dec-1999 |
eivind |
Include vm/vm_extern.h to get at prototypes
|
54803 |
19-Dec-1999 |
rwatson |
Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry.
Reviewed by: eivind
|
54655 |
15-Dec-1999 |
eivind |
Introduce NDFREE (and remove VOP_ABORTOP)
|
54519 |
12-Dec-1999 |
peter |
Fix pointer problem for the Alpha
|
54479 |
12-Dec-1999 |
bp |
Bump local version number to 1.3.4.
|
54444 |
11-Dec-1999 |
eivind |
Lock reporting and assertion changes.
* lockstatus() and VOP_ISLOCKED() gets a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process.
* The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them
* Extend the vnode_if.src format to allow more exact specification than locked/unlocked.
This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS.
Discussed with: grog, mch, peter, phk Reviewed by: peter
|
54424 |
11-Dec-1999 |
peter |
Don't simulate a pseudo address-space beyond VM_MAXUSER_ADDRESS that maps onto the upages. We used to use this extensively, particularly for ps and gdb. Both of these have been "fixed". ps gets the p_stats via eproc along with all the other stats, and gdb uses the regs, fpregs etc files.
Once upon a time the UPAGES were mapped here, but that changed back in January '96. This essentially kills my revisions 1.16 and 1.17. The 2-page "hole" above the stack can be reclaimed now.
|
54371 |
09-Dec-1999 |
semenu |
First version of HPFS stuff.
|
54292 |
08-Dec-1999 |
phk |
Remove unused #includes.
Obtained from: http://bogon.freebsd.dk/include
|
54272 |
07-Dec-1999 |
sos |
Commit the kernel part of our DVD support. Nothing much to say really, it's just a number of new ioctl's, the rest is done in userland.
|
54095 |
03-Dec-1999 |
semenu |
Merged NetBSD version, as they have done improvements:
1. ntfs_read*attr*() functions now accept uio structure to eliminate one data copying.
2. found and removed deadlock caused by 6 concurrent ls -lR.
3. started implementation of normal Unicode<->unix recoding.
Obtained from: NetBSD
|
53975 |
01-Dec-1999 |
mckusick |
Collect read and write counts for filesystems. This new code drops the counting in bwrite and puts it all in spec_strategy. I did some tests and verified that the counts collected for writes in spec_strategy are identical to the counts that we previously collected in bwrite. We now also get read counts (async reads come from requests for read-ahead blocks). Note that you need to compile a new version of mount to get the read counts printed out. The old mount binary is completely compatible, the only reason to install a new mount is to get the read counts printed.
Submitted by: Craig A Soules <soules+@andrew.cmu.edu> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
|
53773 |
27-Nov-1999 |
bp |
Remove abuse of struct nameidata.
Pointed by: Eivind Eklund
|
53709 |
26-Nov-1999 |
phk |
Add a sysctl to control if argv is disclosed to the world: kern.ps_argsopen. It defaults to 1, which means that all users can see all argvs in ps(1).
Reviewed by: Warner
|
53518 |
21-Nov-1999 |
phk |
Introduce the new function p_trespass(struct proc *p1, struct proc *p2) which returns zero or an errno depending on the legality of p1 trespassing on p2.
Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one extra signal related check.
Replace procfs.h:CHECKIO() macros with calls to p_trespass().
Only show command lines to process which can trespass on the target process.
|
53509 |
21-Nov-1999 |
bp |
Remove race condition under SMP.
Noted by: Denis Kalinin <denis@mail.rbc.ru>
|
53503 |
21-Nov-1999 |
phk |
s/p_cred->pc_ucred/p_ucred/g
|
53467 |
20-Nov-1999 |
sef |
A process should be able to examine itself.
|
53452 |
20-Nov-1999 |
phk |
struct mountlist and struct mount.mnt_list have no business being a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively.
This removes ugly mp != (void*)&mountlist comparisons.
Requested by: phk Submitted by: Jake Burkholder jake@checker.org PR: 14967
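The comparison this change removes comes from how the two queue(3) list types terminate: a CIRCLEQ walk ends when the cursor points back at the head, while a TAILQ walk ends at NULL. A minimal userland sketch of the TAILQ form follows; struct demo_mount, demo_sum_ids() and DEMO_MAX are hypothetical names, and <sys/queue.h> is assumed to be available on the build host.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define DEMO_MAX 8

/* Hypothetical list element standing in for struct mount. */
struct demo_mount {
	int id;
	TAILQ_ENTRY(demo_mount) link;
};

TAILQ_HEAD(demo_mountlist, demo_mount);

/*
 * Build a list of n elements with ids 1..n and return the sum of the
 * ids seen while walking it.  The walk stops on a plain NULL check;
 * no comparison against the address of the list head is needed.
 */
static int
demo_sum_ids(int n)
{
	struct demo_mountlist head;
	struct demo_mount nodes[DEMO_MAX];
	struct demo_mount *mp;
	int i, sum = 0;

	if (n < 0 || n > DEMO_MAX)
		return (-1);
	TAILQ_INIT(&head);
	for (i = 0; i < n; i++) {
		nodes[i].id = i + 1;
		TAILQ_INSERT_TAIL(&head, &nodes[i], link);
	}
	for (mp = TAILQ_FIRST(&head); mp != NULL; mp = TAILQ_NEXT(mp, link))
		sum += mp->id;
	return (sum);
}
```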
|
53364 |
18-Nov-1999 |
peter |
Fix an unused variable warning.
|
53359 |
18-Nov-1999 |
peter |
Fix a warning.
|
53301 |
17-Nov-1999 |
phk |
Make proc/*/cmdline use the cached argv if available.
Submitted by: Paul Saab <paul@mu.org> Reviewed by: phk
|
53300 |
17-Nov-1999 |
phk |
The function `procfs_getattr()' in procfs doesn't set the value of vap->va_fsid, so we cannot get valid information about procfs.
Submitted by: SAWADA Mizuki miz@pa.aix.or.jp Reviewed by: phk PR: 1654
|
53131 |
13-Nov-1999 |
eivind |
Remove WILLRELE from VOP_SYMLINK
Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.
|
53101 |
12-Nov-1999 |
eivind |
Remove WILLRELE from VOP_RENAME
|
53059 |
09-Nov-1999 |
phk |
Next step in the device cleanup process.
Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.
Unify spec_open() for bdev and cdev cases.
Remove the disabled bdev specific read/write code.
|
53045 |
09-Nov-1999 |
alc |
Passing "0" or "FALSE" as the fourth argument to vm_fault is wrong. It should be "VM_FAULT_NORMAL".
|
53017 |
08-Nov-1999 |
phk |
remove a confusing and stale comment.
|
53016 |
08-Nov-1999 |
phk |
Oops, a bit too hasty there.
|
53010 |
08-Nov-1999 |
phk |
Various cleanups.
|
52990 |
08-Nov-1999 |
sef |
Explain why Warner is right, and I am wrong, in the removing of the file object. Also explain some possible directions to re-implement it -- I'm not sure it should be, given the minimal application use. (Other than having the debugger automatically access the symbols for a process, the main use I'd found was with some minor accounting ability, but _that_ depends on it being in the filesystem space; an ioctl access method would be useless in that case.)
This is a code-less change; only a comment has been added.
|
52988 |
08-Nov-1999 |
peter |
Update for fileops.fo_stat() addition. Note, this would panic if it saw a DTYPE_PIPE. This isn't quite right but should stop a crash.
|
52971 |
07-Nov-1999 |
phk |
Use vop_panic() instead of spec_badop().
|
52967 |
07-Nov-1999 |
phk |
Remove the iskmemdev() function. Make it the responsibility of the mem.c drivers to enforce the securelevel checks.
|
52961 |
07-Nov-1999 |
sef |
Make an incredibly stupid change because Warner threatened to do it and continue doing it despite objections by me (the principal author).
Note that this doesn't fix the real problem -- the real problem is generally bad setup by ignorant users, and education is the right way to fix it.
So while this doesn't actually solve the problem mentioned in the complaint (since it's still possible to do it via other methods, although they mostly involve a bit more complicity), and there are better methods to do this, nobody was willing or able to provide me with a real world example that couldn't be worked around using the existing permissions and group mechanism. And therefore, security by removing features is the method of the day.
I only had three applications that used it, in any event. One of them would have made debugging easier, but I still haven't finished it, and won't now, so it doesn't really matter.
|
52814 |
02-Nov-1999 |
archie |
Change structure field named 'toupper' to 'to_upper' to avoid conflict with the macro of the same name. Same thing for 'tolower'.
|
52782 |
01-Nov-1999 |
msmith |
Newline-terminate the complaint message about not being able to find the root vnode pointer.
|
52728 |
01-Nov-1999 |
phk |
Remove specfs::vop_lookup(). There is no code path which can call it.
|
52719 |
31-Oct-1999 |
bp |
Bump version number to sync with ncplib 1.3.3
|
52635 |
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
52399 |
20-Oct-1999 |
dillon |
A tentative agreement has been reached in regards to a procedure to remove 'b'lock devices. The agreement is, essentially, that block devices will be collapsed into character devices as a first step (though I don't particularly agree), and raw device names 'rxxx' will become simply 'xxx' in devfs in the second step (i.e. no 'rxxx' names will exist). The renaming will not affect the original /dev and the expectation is that devfs will eventually (but not immediately) become the standard way to access devices in the system.
If it is determined that a reimplementation of block device access characteristics is beneficial, a number of alternatives will be possible that do not involve resurrecting the 'b'lock device class. For example, an ioctl() that might be made on an open character device descriptor or a generic buffered overlay device.
This commit removes the blockdev disablement sysctl which does not apply to the solution that was reached.
|
52385 |
18-Oct-1999 |
phk |
Change the default for the vfs.bdev_buffered sysctl to zero.
This means that access to block device nodes will act the same as char device nodes for disk-like devices.
If you encounter problems after this, where programs accessing disks directly fail to operate, please use the following command to revert to previous behaviour:
sysctl -w vfs.bdev_buffered=1
And verify that this was indeed the cause of your trouble.
See the mail-archives of the arch@FreeBSD.org list for background.
|
52230 |
14-Oct-1999 |
bp |
Under some conditions a vnode can reference itself.
|
52229 |
14-Oct-1999 |
bp |
Isolate old constant NCP_VOLNAME_LEN.
|
52152 |
12-Oct-1999 |
bp |
Remove unnecessary includes.
|
52137 |
11-Oct-1999 |
phk |
remove unused #includes
|
52034 |
08-Oct-1999 |
phk |
Add a couple of strategic KASSERTs
|
52032 |
08-Oct-1999 |
phk |
Add back sysctl vfs.enable_userblk_io
|
51983 |
07-Oct-1999 |
bp |
Put back cn_namelen initialization. Removed by phk in rev 1.2.
|
51929 |
04-Oct-1999 |
phk |
Warn once per driver about dev_t's not registered with make_dev().
|
51926 |
04-Oct-1999 |
phk |
Move the buffered read/write code out of spec_{read|write} and into two new functions spec_buf{read|write}.
Add sysctl vfs.bdev_buffered which defaults to 1 == true. This sysctl can be used to experimentally turn buffered behaviour for bdevs off. It should not be changed while any block devices are open. Remove the misplaced sysctl vfs.enable_userblk_io.
No other changes in behaviour.
|
51906 |
03-Oct-1999 |
phk |
Before we start to mess with the VFS name-cache clean things up a little bit: Isolate the namecache in its own file, and give it a dedicated malloc type.
|
51852 |
02-Oct-1999 |
bp |
Import kernel part of ncplib: netncp and nwfs
Reviewed by: msmith, peter Obtained from: ncplib
|
51808 |
30-Sep-1999 |
phk |
Remove the D_NOCLUSTER[RW] options which were added because vn had problems. Now that Matt has fixed vn, this can go. The vn driver should have used d_maxio (now si_iosize_max) anyway.
|
51797 |
29-Sep-1999 |
phk |
Remove v_maxio from struct vnode.
Replace it with mnt_iosize_max in struct mount.
Nits from: bde
|
51791 |
29-Sep-1999 |
marcel |
sigset_t change (part 2 of 5)
-----------------------------
The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements.
The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t.
struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done through function sigprop(). The latter because the table doesn't hold properties for all signals, but only for the first NSIG signals.
signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace pollution.
Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types.
NOTE: kdump (and port linux_kdump) must be recompiled.
Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.
|
51747 |
28-Sep-1999 |
dillon |
Make sure file after VOP_OPEN is VMIO'd when transferring control from a lower layer to an upper layer. I'm not sure how necessary this is for reading.
Fix bug in union_lookup() (note: there are probably still several bugs in union_lookup()). This one set lerror as a side effect without setting lowervp, causing copyup code further on down to crash on a null lowervp pointer. Changed the side effect to use a temporary variable instead.
|
51688 |
26-Sep-1999 |
dillon |
This is a major fixup of unionfs. At least 30 serious bugs have been fixed (many due to changing semantics in other parts of the kernel and not the original author's fault), including one critical one: unionfs could cause UFS corruption in the fronting store due to calling VOP_OPEN for writing without turning on vmio for the UFS vnode.
Most of the bugs were related to semantics changes in VOP calls, lock ordering problems (causing deadlocks), improper handling of a read-only backing store (such as an NFS mount), improper referencing and locking of vnodes, not using real struct locks for vnode locking, not using recursive locks when accessing the fronting store, and things like that.
New functionality has been added: unionfs now has mmap() support, but only partially tested, and rename has been enhanced considerably.
There are still some things that unionfs cannot do. You cannot rename a directory without confusing unionfs, and there are issues with softlinks, hardlinks, and special files. unionfs mostly doesn't understand them (and never did).
There are probably still panic situations, but hopefully nowhere near as many as before this commit.
The unionfs in this commit has been tested overlaid on /usr/src (backing /usr/src being a read-only NFS mount, fronting /usr/src being a local filesystem). Kernel builds have been tested, buildworld is undergoing testing. More testing is necessary.
|
51662 |
25-Sep-1999 |
phk |
Remove a warning check which was too general.
|
51658 |
25-Sep-1999 |
phk |
Remove five now unused fields from struct cdevsw. They should never have been there in the first place. A GENERIC kernel shrinks almost 1k.
Add a slightly different safetybelt under nostop for tty drivers.
Add some missing FreeBSD tags
|
51654 |
25-Sep-1999 |
phk |
This patch clears the way for removing a number of tty related fields in struct cdevsw:
d_stop moved to struct tty. d_reset already unused. d_devtotty linkage now provided by dev_t->si_tty.
These fields will be removed from struct cdevsw together with d_params and d_maxio Real Soon Now.
The changes in this patch consist of:
initialize dev->si_tty in *_open()
initialize tty->t_stop
remove devtotty functions
rename ttpoll to ttypoll
a few adjustments to these changes in the generic code
a bump of __FreeBSD_version
add a couple of FreeBSD tags
|
51558 |
22-Sep-1999 |
phk |
Kill the cdevsw->d_maxio field.
d_maxio is replaced by the dev->si_iosize_max field, which the driver should set in all calls to cdevsw->d_open if it has a better idea than the system-wide default.
The field is a generic dev_t field (ie: not disk specific) so that tapes and other devices can use physio as well.
|
51486 |
20-Sep-1999 |
dillon |
More removals of vnode->v_lastr, replaced by preexisting seqcount heuristic to detect sequential operation.
VM-related forced clustering code removed from ufs in preparation for a commit to vm/vm_fault.c that does it more generally.
Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
|
51485 |
20-Sep-1999 |
dillon |
Fix handling of a device EOF that occurs in the middle of a block. The transfer size calculation was incorrect, resulting in the last read being potentially larger than the actual extent of the device.
EOF and write handling has not yet been fixed.
Reviewed by: Tor.Egge@fast.no
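The transfer size fix described above amounts to clamping the last read to the remaining extent of the device. A minimal sketch in Python, assuming byte offsets and sizes; `transfer_size` and its parameters are hypothetical names for illustration and do not appear in the kernel source:

```python
def transfer_size(offset, block_size, device_size):
    """Bytes to read for the block starting at offset, never past the device end.

    All three arguments are in bytes. The names are illustrative only.
    """
    remaining = device_size - offset
    if remaining <= 0:
        # At or beyond EOF: nothing left to transfer.
        return 0
    # A full block, unless EOF falls in the middle of this block.
    return min(block_size, remaining)
```

With a 10000-byte device and 8192-byte blocks, the second block is cut short at the device end instead of being read in full.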
|
51479 |
20-Sep-1999 |
phk |
Step one of replacing devsw->d_maxio with si_bsize_max.
Rename dev->si_bsize_max to si_iosize_max and set it in spec_open if the device didn't.
Set vp->v_maxio from dev->si_bsize_max in spec_open rather than in ufs_bmap.c
|
51345 |
17-Sep-1999 |
dillon |
Add vfs.enable_userblk_io sysctl to control whether user reads and writes to buffered block devices are allowed. The default is to be backwards compatible, i.e. reads and writes are allowed.
The idea is for a larger crowd to start running with this disabled and see what problems, if any, crop up, and then to change the default to off and see if any problems crop up in the next 6 months prior to potentially removing support entirely. There are still a few people, Julian and myself included, who believe the buffered block device access from usermode to be useful.
Remove use of vnode->v_lastr from buffered block device I/O in preparation for removal of vnode->v_lastr field, replacing it with the already existing seqcount metric to detect sequential operation.
Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
|
51138 |
11-Sep-1999 |
alfred |
Separate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP.
Add fh(open|stat|statfs) syscalls to allow userland to query filesystems based on (network) filehandle.
Obtained from: NetBSD
|
51111 |
09-Sep-1999 |
julian |
Changes to centralise the default blocksize behaviour. More likely to follow.
Submitted by: phk@freebsd.org
|
51068 |
07-Sep-1999 |
alfred |
All unimplemented VFS ops now have entries in kern/vfs_default.c that return reasonable defaults.
This avoids confusing and ugly casting to eopnotsupp or making dummy functions. Bogus casting of filesystem sysctls to eopnotsupp() has been removed.
This should make *_vfsops.c more readable and reduce bloat.
Reviewed by: msmith, eivind Approved by: phk Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
|
50890 |
04-Sep-1999 |
bde |
Get rid of the NULLFS_DIAGNOSTIC option. This option was as useful as the other XXXFS_DIAGNOSTIC options (not very) and mostly controlled tracing of normal operation. Use `#ifdef DEBUG' for non-diagnostics and `#ifdef DIAGNOSTIC' for diagnostics.
|
50888 |
04-Sep-1999 |
bde |
Fixed the previous change. Some more code controlled by UMAPFS_DIAGNOSTIC is actually for diagnostics; control it with DIAGNOSTIC and not DDB.
|
50839 |
03-Sep-1999 |
julian |
Print out the device name when there is an uninitialised IO size or IO error in spec_getpages().
Submitted by: phk suggested the idea.
|
50835 |
03-Sep-1999 |
julian |
Add a catchall to set default blocksize values for disk like devices.
Submitted by: phk@freebsd.org
|
50830 |
03-Sep-1999 |
julian |
Revert a bunch of controversial changes by PHK. After a quick think and discussion among various people some form of some of these changes will probably be recommitted.
The reversion was requested by dg while discussions proceed. PHK has indicated that he can live with this, and it has been agreed that some form of some of these changes may return shortly after further discussion.
|
50752 |
01-Sep-1999 |
phk |
Fix the sense of the vn_isdisk() check.
|
50715 |
31-Aug-1999 |
phk |
Set the buffersize for non BSDFFS labeled partitions to max(dev->si_bsize_phys, BLKDEV_IOSIZE).
Requested by: davidg
|
50714 |
31-Aug-1999 |
phk |
Make buffered access to bdevs from userland controllable with a sysctl vfs.bdev_access.
|
50623 |
30-Aug-1999 |
phk |
Make bdev userland access work like cdev userland access unless the highly non-recommended option ALLOW_BDEV_ACCESS is used.
(bdev access is evil because you don't get write errors reported.)
Kill si_bsize_best before it kills Matt :-)
Use the specfs routines rather than having cloned copies in devfs.
|
50616 |
30-Aug-1999 |
bde |
Converted the silly SAFETY option into a new-style option by renaming it to DIAGNOSTIC.
Fixed an English style bug in the panic messages controlled by SAFETY.
|
50554 |
29-Aug-1999 |
bde |
Changed old-style option UNION_DIAGNOSTIC to DEBUG and fixed printf format errors exposed by this. It has nothing to do with diagnostics since it does little more than control tracing of normal operation. Actual diagnostics for the union file system are still controlled by the DIAGNOSTIC option.
|
50553 |
29-Aug-1999 |
bde |
Changed old-style options UMAPFS_DIAGNOSTIC and UMAP_DIAGNOSTIC to DEBUG or DDB and fixed printf format errors exposed by this. The options had little to do with diagnostics; they mostly controlled tracing of normal operation.
|
50523 |
28-Aug-1999 |
phk |
Fix various trivial warnings from LINT
|
50477 |
28-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
50405 |
26-Aug-1999 |
phk |
Simplify the handling of VCHR and VBLK vnodes using the new dev_t:
Make the alias list a SLIST.
Drop the "fast recycling" optimization of vnodes (including the returning of a preexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore.
Rename checkalias2() and checkalias() to addalias() and addaliasu(), which take a dev_t and a udev_t argument respectively.
Make the revoke syscalls use vcount() instead of VALIASED.
Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag.
vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one.
Print the devicename in specfs/vprint().
Remove a couple of stale LFS vnode flags.
Remove unimplemented/unused LK_DRAINED.
|
50347 |
25-Aug-1999 |
phk |
Introduce vn_isdisk(struct vnode *vp) function, and use it to test for diskness.
|
50327 |
25-Aug-1999 |
julian |
Fix comment to match reality.. vop_strategy gets a vnode argument these days.
|
50256 |
23-Aug-1999 |
bde |
Initialise fsids with (user) device numbers again. Bitrot when dev_t's were changed to pointers was obscured by casting dev_t's to longs. fsids haven't even been comprised of longs since the Lite2 merge.
|
50254 |
23-Aug-1999 |
phk |
Convert DEVFS hooks in (most) drivers to make_dev().
Diskslice/label code not yet handled.
Vinum, i4b, alpha, pc98 not dealt with (left to respective Maintainers)
Add the correct hook for devfs to kern_conf.c
The net result of this exercise is that a lot fewer files depend on DEVFS, and devtoname() gets more sensible output in many cases.
A few drivers had minor additional cleanups performed relating to cdevsw registration.
A few drivers don't register a cdevsw{} anymore, but only use make_dev().
|
50061 |
19-Aug-1999 |
marcel |
Let processes retrieve their argv through procfs. Revert to the original behaviour in all other cases.
Submitted by: Andrew Gordon <arg@arg1.demon.co.uk>
|
49945 |
17-Aug-1999 |
alc |
Add the (inline) function vm_page_undirty for clearing the dirty bitmask of a vm_page.
Use it.
Submitted by: dillon
|
49771 |
14-Aug-1999 |
phk |
Spring cleaning around strategy and disklabels/slices:
Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout. Please see the comment in sys/conf.h about the flag argument.
Remove strategy argument from all the diskslice/label/bad144 implementations, it should be found from the dev_t.
Remove bogus and unused strategy1 routines.
Remove open/close arguments from dssize(). Pick them up from dev_t.
Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
|
49695 |
13-Aug-1999 |
phk |
Add support for device drivers which want to track all open/close operations. This allows a device driver better insight into what is going on than the current:
proc1: open /dev/foo R/O    devsw->open( R/O, proc1 )
proc2: open /dev/foo R/W    devsw->open( R/W, proc2 )
proc2: close                /* nothing, but device is really only R/O open */
proc1: close                devsw->close( R/O, proc1 )
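The difference between last-close-only notification and full open/close tracking can be modelled with a small simulation. This is a sketch in Python under stated assumptions: `driver_calls` and the `track_all` switch are hypothetical names, and the real mechanism a driver uses to request tracking is not described in the commit message.

```python
def driver_calls(events, track_all):
    """Return the open/close calls a driver would observe.

    events is a list of (proc, action, mode) tuples in the order they happen.
    With track_all False, only the final close reaches the driver, as in the
    sequence quoted above; with track_all True every close is passed through.
    """
    calls = []
    open_count = 0
    for proc, action, mode in events:
        if action == "open":
            open_count += 1
            calls.append(("open", mode, proc))
        elif action == "close":
            open_count -= 1
            if track_all or open_count == 0:
                calls.append(("close", mode, proc))
    return calls


# The four-step sequence from the commit message.
SCENARIO = [
    ("proc1", "open", "R/O"),
    ("proc2", "open", "R/W"),
    ("proc2", "close", "R/W"),
    ("proc1", "close", "R/O"),
]
```

Calling `driver_calls(SCENARIO, track_all=False)` omits the close by proc2, so the driver never learns that the device is no longer open for writing; with `track_all=True` all four calls are reported.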
|
49687 |
13-Aug-1999 |
phk |
Don't examine vp->v_tag (see comment in vnode.h)
|
49681 |
13-Aug-1999 |
phk |
Remove spec_getattr(), which as far as I can tell can never be called from the current code-paths, and if it were, would panic on any unmounted bdev.
|
49679 |
13-Aug-1999 |
phk |
The bdevsw() and cdevsw() are now identical, so kill the former.
|
49678 |
13-Aug-1999 |
phk |
s/v_specinfo/v_rdev/
|
49535 |
08-Aug-1999 |
phk |
Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>.
Add a few fields to struct specinfo, paving the way for the fun part.
|
49525 |
08-Aug-1999 |
bde |
Fixed printf format errors (%qu -> %llu; the arg was already unsigned long long to hide problems on alphas).
|
49524 |
08-Aug-1999 |
bde |
Fixed all printf format errors reported by gcc -Wformat on i386's:
- %q -> %ll; don't assume that the promotion of off_t is quad_t; only assume that off_t's are representable as long longs.
- printing of dev_t's was completely broken.
Fixed nearby printf format errors not reported by gcc -Wformat on i386's:
- printing of ino_t's and pointers was sloppy.
|
49383 |
02-Aug-1999 |
rvb |
The dev returned here is what is found in the st_dev field. This should not be further translated ... hence the 0.
|
49075 |
25-Jul-1999 |
bde |
Don't set DE_ACCESS for unsuccessful reads. Translated from: a similar fix in ufs_readwrite.c rev.1.61.
Don't forget to set DE_ACCESS for short reads.
Check for invalid (negative) offsets before checking for reads of 0 bytes, as in ufs, although checking for invalid offsets at all is probably a bug.
|
48960 |
21-Jul-1999 |
phk |
Remove the RCS "Log" and all the verbiage it has generated.
|
48936 |
20-Jul-1999 |
phk |
Now a dev_t is a pointer to struct specinfo which is shared by all specdev vnodes referencing this device.
Details: cdevsw->d_parms has been removed, the specinfo is available now (== dev_t) and the driver should modify it directly when applicable, and the only driver doing so, does so: vn.c. I am not sure the logic in checking for "<" was right before, and it looks even less so now.
An initial pool of 50 struct specinfo are depleted during early boot, after that malloc had better work. It is likely that fewer than 50 would do.
Hashing is done from udev_t to dev_t with a prime number remainder hash; experiments show no better hash available for decent cost (MD5 is only marginally better). The prime number used should not be close to a power of two; we use 83 for now.
Add new checkalias2() to get around the loss of info from dev2udev() in bdevvp().
The aliased vnodes are hung on a list straight off the dev_t, and speclisth[SPECSZ] is unused. The sharing of struct specinfo means that the v_specnext moves into the vnode which grows by 4 bytes.
Don't use a VBLK dev_t which doesn't make sense in MFS, now we hang a dummy cdevsw on B/Cmaj 253 so that things look sane.
Storage overhead from all of this is O(50k).
Bump __FreeBSD_version to 400009
The next step will add the stuff needed so device-drivers can start to hang things from struct specinfo
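The prime number remainder hash mentioned in this entry can be sketched as follows. This is a minimal Python illustration, not the kernel code: `DEVT_HASH`, `bucket_for` and `lookup_or_create` are hypothetical names, a plain dict stands in for struct specinfo, and only the modulus 83 comes from the commit message.

```python
DEVT_HASH = 83  # prime bucket count quoted in the commit message

# One chain per hash value; each entry stands in for a shared struct specinfo.
buckets = [[] for _ in range(DEVT_HASH)]


def bucket_for(udev):
    """Prime-number remainder hash: map a udev_t-like integer to a bucket index."""
    return udev % DEVT_HASH


def lookup_or_create(udev):
    """Return the single shared record for udev, creating it on first use."""
    chain = buckets[bucket_for(udev)]
    for info in chain:
        if info["udev"] == udev:
            return info
    info = {"udev": udev}
    chain.append(info)
    return info
```

Two lookups of the same number return the same object, which is the sharing property the entry describes; numbers that differ by a multiple of 83 land in the same chain but get distinct records.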
|
48926 |
20-Jul-1999 |
phk |
Don't access the device with vp->v_specinfo->si_rdev, use vp->v_rdev.
|
48859 |
17-Jul-1999 |
phk |
I have not one single time remembered the name of this function correctly so obviously I gave it the wrong name. s/umakedev/makeudev/g
|
48719 |
09-Jul-1999 |
phk |
Allow jailed processes to open non-process vnodes like the root of the fs.
|
48715 |
09-Jul-1999 |
peter |
Use %q rather than rolling a custom routine.
|
48692 |
09-Jul-1999 |
jlemon |
Support for i386 hardware breakpoints.
Submitted by: Brian Dean <brdean@unx.sas.com>
|
48691 |
09-Jul-1999 |
jlemon |
Implement support for hardware debug registers on the i386.
Submitted by: Brian Dean <brdean@unx.sas.com>
|
48468 |
02-Jul-1999 |
phk |
Make sure that stat(2) and friends always return a valid st_dev field.
Pseudo-FS need not fill in the va_fsid anymore, the syscall code will use the first half of the fsid, which now looks like a udev_t with major 255.
|
48425 |
01-Jul-1999 |
peter |
move <sys/systm.h> before <sys/buf.h>
|
48225 |
26-Jun-1999 |
mckusick |
Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
|
47964 |
16-Jun-1999 |
mckusick |
Add a vnode argument to VOP_BWRITE to get rid of the last vnode operator special case. Delete special case code from vnode_if.sh, vnode_if.src, umap_vnops.c, and null_vnops.c.
|
47897 |
13-Jun-1999 |
phk |
Eliminate the bogus procfs private almost struct dirent structure.
Spotted by: Lars Hamren Reviewed by: bde
|
47686 |
01-Jun-1999 |
dt |
Remove an unused variable.
|
47640 |
31-May-1999 |
phk |
Simplify cdevsw registration.
The cdevsw_add() function now finds the major number(s) in the struct cdevsw passed to it. cdevsw_add_generic() is no longer needed, cdevsw_add() does the same thing.
cdevsw_add() will print a message if the d_maj field looks bogus.
Remove nblkdev and nchrdev variables. Most places they were used bogusly. Instead check a dev_t for validity by seeing if devsw() or bdevsw() returns NULL.
Move bdevsw() and devsw() functions to kern/kern_conf.c
Bump __FreeBSD_version to 400006
This commit removes:
72 bogus makedev() calls
26 bogus SYSINIT functions
if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.
I4b and vinum not changed. Patches emailed to authors. LINT probably broken until they catch up.
|
47625 |
30-May-1999 |
phk |
This commit should be an extensive NO-OP:
Reformat and initialize correctly all "struct cdevsw".
Initialize the d_maj and d_bmaj fields.
The d_reset field was not removed, although it is never used.
I used a program to do most of this, so all the files now use the same consistent format. Please keep it that way.
Vinum and i4b not modified, patches emailed to respective authors.
|
47407 |
22-May-1999 |
dt |
Don't call calcru() on a swapped-out process. calcru() accesses p_stats, which is in the U-area.
|
47060 |
12-May-1999 |
semenu |
Driver is now ported to NetBSD.
Submitted by: Christos Zoulas <christos@zoulas.com>
|
47028 |
11-May-1999 |
phk |
Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland.
Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev()
For now they're functions, they will become in-line functions after one of the next two steps in this process.
Return major/minor/makedev to macro-hood for userland.
Register a name in cdevsw[] for the "filedescriptor" driver.
In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to an actual device.
In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few housekeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang).
A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that.
Without DEVT_FASCIST I believe this patch is a no-op.
Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result.
Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).
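The "major|minor" bitmap that this entry renames to udev_t can be illustrated with a small sketch. The bit layout below (major number in bits 8 to 15, minor number in the remaining bits) is an assumption based on the traditional BSD encoding; the commit message does not spell out the masks, and these Python functions only borrow the names umajor(), uminor() and umakedev() from the list above.

```python
MAJOR_MASK = 0xFF          # assumed: 8-bit major number
MINOR_MASK = 0xFFFF00FF    # assumed: minor uses every bit except bits 8..15


def umakedev(major, minor):
    """Pack a major and minor number into one udev_t-like integer."""
    return ((major & MAJOR_MASK) << 8) | (minor & MINOR_MASK)


def umajor(udev):
    """Extract the major number from the packed value."""
    return (udev >> 8) & MAJOR_MASK


def uminor(udev):
    """Extract the minor number from the packed value."""
    return udev & MINOR_MASK
```

The point of the split is that a packed integer like this only names a potential device, whereas the kernel-side dev_t refers to an actual one.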
|
46795 |
09-May-1999 |
phk |
remove cast from dev_t to dev_t.
|
46676 |
08-May-1999 |
phk |
I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.
Changed the BDEV variant to this format as well: bdevsw(dev_t dev)
DEVFS will eventually benefit from this change too.
|
46669 |
08-May-1999 |
dcs |
The lowercasing of Joliet filenames was not a feature.
|
46635 |
07-May-1999 |
phk |
Continue where Julian left off in July 1998:
Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline) function.
Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention to the order of the cmaj/bmaj arguments!)
Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE (ditto!)
(Next step will be to convert all bdev dev_t's to cdev dev_t's before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)
|
46593 |
06-May-1999 |
peter |
One too many vfsops..
|
46580 |
06-May-1999 |
phk |
remove b_proc from struct buf, it's (now) unused.
Reviewed by: dillon, bde
|
46568 |
06-May-1999 |
peter |
Add sufficient braces to keep egcs happy about potentially ambiguous if/else nesting.
|
46389 |
04-May-1999 |
phk |
Make the type and map files claim 0 bytes size. Tar doesn't get confused now, but doesn't store any data either.
I wonder if we shouldn't claim to be fifos instead...
|
46388 |
04-May-1999 |
phk |
Add even more () to CHECKIO which by now feels positively LISPish.
Submitted by: bde Reviewed by: phk
|
46201 |
30-Apr-1999 |
phk |
Add a new "file" to procfs: "rlimit" which shows the resource limits for the process.
PR: 11342 Submitted by: Adrian Chadd adrian@freebsd.org Reviewed by: phk
|
46155 |
28-Apr-1999 |
phk |
This implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do.
For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers".
Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname.
Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors.
It generally does what one would expect, but setting up a jail still takes a little knowledge.
A few notes:
I have no scripts for setting up a jail, don't ask me for them.
The IP number should be an alias on one of the interfaces.
mount a /proc in each jail, it will make ps more useable.
/proc/<pid>/status tells the hostname of the prison for jailed processes.
Quotas are only sensible if you have a mountpoint per prison.
There are no provisions for stopping resource-hogging.
Some "#ifdef INET" and similar may be missing (send patches!)
If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome!
Tools, comments, patches & documentation most welcome.
Have fun...
Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/
|
46116 |
27-Apr-1999 |
phk |
Change suser_xxx() to suser() where it applies.
|
46112 |
27-Apr-1999 |
phk |
Suser() simplification:
1: s/suser/suser_xxx/
2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with later.
There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
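The substitution in step 3 can be tried outside sed. The sketch below is a Python rendering of the same expression; `simplify_suser` is a hypothetical helper name, and the pattern is a direct translation of the sed regex rather than the tool actually used for the commit.

```python
import re

# Python spelling of the sed expression from step 3. The \1 backreference
# requires the same identifier in both arguments, so only calls of the form
# suser_xxx(p->p_ucred, &p->p_acflag) are rewritten.
SUSER_XXX = re.compile(r"suser_xxx\(([a-zA-Z0-9_]*)->p_ucred, &\1->p_acflag\)")


def simplify_suser(source):
    """Rewrite matching suser_xxx() calls to the new one-argument suser()."""
    return SUSER_XXX.sub(r"suser(\1)", source)
```

Calls whose two arguments do not refer to the same process pointer are left alone, which matches the note that the remaining suser_xxx() calls are dealt with later.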
|
45879 |
20-Apr-1999 |
semenu |
Removed annoying messages during boot, added some checks before mounting (should help avoid mounting extended partitions :-). Fixed problem with hanging while unmounting busy fs.
And (the most important) added some locks to prevent simultaneous access to kernel structures!
|
45773 |
18-Apr-1999 |
dcs |
Add support for Joliet extensions to the iso9660 fs. The related PR cannot yet be closed, though.
I hope I got all credits right, and that the multiple submitted by lines do not break anyone's scripts...
PR: kern/5038, kern/5567 Submitted by: Keith Jang <keith@email.gcn.net.tw> Submitted by: Joachim Kuebart <joki@kuebart.stuttgart.netsurf.de> Submitted by: Byung Yang <byung@wam.umd.edu> Submitted by: Motomichi Matsuzaki <mzaki@e-mail.ne.jp>
|
45653 |
13-Apr-1999 |
semenu |
Removed DIAGNOSTIC option redefinition.
Submitted by: Eivind Eklund <eivind@FreeBSD.org>
|
45347 |
05-Apr-1999 |
julian |
Catch a case spotted by Tor where files mmapped could leave garbage in the unallocated parts of the last page when the file ended on a frag but not a page boundary. Delimited by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF, in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c ufs/ufs/ufs_readwrite.c kern/vfs_bio.c
Submitted by: Matt Dillon <dillon@freebsd.org> Reviewed by: Alan Cox <alc@freebsd.org>
|
45098 |
28-Mar-1999 |
dt |
Back out half of 1.32: don't print a message on every failed mount attempt. It is too chatty and hardly useful. 2 messages in somewhat usual cases are left for now.
|
44693 |
12-Mar-1999 |
imp |
Don't allow anyone except root to mount file systems that map uids. This can have bad security implications, but the impact on FreeBSD systems is minimal because this fs isn't in the default kernels and it is unknown if it even works.
Submitted by: Manuel Bouyer <bouyer@antioche.eu.org> and Artur Grabowski <art@stacken.kth.se>
|
44329 |
28-Feb-1999 |
peter |
This code got moved as a result of confusion between union mounts and unionfs. Julian has already revived the union mount part of this move in vfs_syscalls.c rev 1.119, but forgot to take it out of here.
|
44247 |
25-Feb-1999 |
dillon |
Reviewed by: Julian Elischer <julian@whistle.com>
Add d_parms() to {c,b}devsw[]. If non-NULL this function points to a device routine that will properly fill in the specinfo structure. vfs_subr.c's checkalias() supplies appropriate defaults. This change should be fully backwards compatible with existing devices.
|
44146 |
19-Feb-1999 |
luoqi |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper.
Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillon <dillon@apollo.backplane.com>
|
44142 |
19-Feb-1999 |
semenu |
Added limited write ability. Now we can use some kind of files for swap holders. See mount_ntfs.8 for details.
|
43748 |
07-Feb-1999 |
dillon |
Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.
|
43634 |
05-Feb-1999 |
jdp |
Correct a format mismatch on 64-bit architectures. This should fix the erroneous values in the procfs "map" file on the Alpha.
|
43552 |
03-Feb-1999 |
semenu |
First version. Reviewed by: David O'Brien <obrien@NUXI.com>
|
43461 |
31-Jan-1999 |
bde |
Don't comment out dead code; remove it.
|
43427 |
30-Jan-1999 |
phk |
Use suser() to determine super-user-ness. Don't pretend we can mount RW.
Reviewed by: bde
|
43382 |
29-Jan-1999 |
bde |
Removed a bogus cast to c_caddr_t. This is part of terminating c_caddr_t with extreme prejudice. Here we want to convert from `const char *' to `const char *'. Casting through c_caddr_t is not the way to do this. The original cast to caddr_t was apparently to break warnings about const mismatches in other versions of BSD (in 4.4BSDLite2, the conversion is from `const char *path' to plain caddr_t).
|
43311 |
28-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
43309 |
27-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile.
This commit includes significant work to properly handle const arguments for the DDB symbol routines.
|
43305 |
27-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
43301 |
27-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
43295 |
27-Jan-1999 |
dillon |
Fix warnings preparing for -Wall -Wcast-qual
Also disable one usb module in LINT due to fatal compilation errors, temporary.
|
42957 |
21-Jan-1999 |
dillon |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
42900 |
20-Jan-1999 |
eivind |
Add 'options DEBUG_LOCKS', which stores extra information in struct lock, and add some macros and function parameters to make sure that the information gets to the point where it can be put in the lock structure.
While I'm here, add DEBUG_VFS_LOCKS to LINT.
|
42770 |
17-Jan-1999 |
peter |
Missed a stray LKM #ifdef
|
42768 |
17-Jan-1999 |
peter |
Mountroot could conceivably make sense to a KLD though, in the preload case. I'm not sure the autoconf code is up to it though...
|
42763 |
17-Jan-1999 |
peter |
Clean up the KLD/LKM goop a bit.
|
42568 |
12-Jan-1999 |
eivind |
Remove declarations for undefined functions and a couple of unused enotsupp implementations.
|
42374 |
07-Jan-1999 |
bde |
Don't pass unused timestamp args to UFS_UPDATE() or waste time initializing them. This almost finishes centralizing (in-core) timestamp updates in ufs_itimes().
|
42315 |
05-Jan-1999 |
eivind |
Remove the 'waslocked' parameter to vfs_object_create().
|
42301 |
05-Jan-1999 |
peter |
A partial implementation of the procfs cmdline pseudo-file. This is enough to satisfy things like StarOffice. This is a hack, but doing it properly would be a LOT of work, and would require extensive grovelling around in the user address space to find the argv[].
Obtained from: Mostly from Andrzej Bialecki <abial@nask.pl>.
|
42252 |
02-Jan-1999 |
dt |
Now empty DOS filesystems default to long file names. Non-empty filesystems without traces of Win95 default to short file names, as before.
|
42249 |
02-Jan-1999 |
dt |
Ensure that deHighClust in direntry is always initialized.
Noticed by: Carl Mascott <cmascott@world.std.com>
Don't write access time of a file more than once per day. (Its precision is 1 day anyway). Don't try to write access and creation time in nonwin95 case.
Suggested by: bde (long time ago).
|
42248 |
02-Jan-1999 |
bde |
Ifdefed conditionally used simplock variables.
|
42227 |
01-Jan-1999 |
bde |
Made this compile if UMAPFS_DIAGNOSTIC is defined. This has been broken since before rev.1.1, so UMAPFS_DIAGNOSTIC should not be trusted. UMAPFS_DIAGNOSTIC is commented out in LINT to hide various bugs.
|
41836 |
16-Dec-1998 |
eivind |
Fix possible NULL-pointer deref in error case (same as DEVFS).
|
41761 |
14-Dec-1998 |
dillon |
Cleanup uninitialized-possibly-used (but really not) warnings
|
41591 |
07-Dec-1998 |
archie |
The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.
|
41570 |
07-Dec-1998 |
eivind |
'\0' is the most ugly NULL pointer constant I've ever seen in real code.
|
41560 |
06-Dec-1998 |
jkh |
MFC: loosen compare even though bde doesn't like it.
|
41514 |
04-Dec-1998 |
archie |
Examine all occurrences of sprintf(), strcat(), and str[n]cpy() for possible buffer overflow problems. Replaced most sprintf()'s with snprintf(); for other cases, added terminating NUL bytes where appropriate, replaced constants like "16" with sizeof(), etc.
These changes include several bug fixes, but most changes are for maintainability's sake. Any instance where it wasn't "immediately obvious" that a buffer overflow could not occur was made safer.
Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Matthew Dillon <dillon@apollo.backplane.com> Reviewed by: Mike Spengler <mks@networkcs.com>
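As a rough illustration of the pattern this commit describes (not code taken from the commit), the sketch below formats into a fixed-size buffer with snprintf() and lets the caller pass sizeof() rather than a hard-coded length. The helper name format_unit_name and the buffer size in the usage note are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<name><unit>" in buf without overflowing it.
 * Returns 0 on success, or -1 if the result did not fit.  When buflen > 0,
 * snprintf() always leaves buf NUL-terminated, even on truncation.
 */
static int
format_unit_name(char *buf, size_t buflen, const char *name, int unit)
{
	int n;

	n = snprintf(buf, buflen, "%s%d", name, unit);
	if (n < 0 || (size_t)n >= buflen)
		return (-1);	/* encoding error or truncated output */
	return (0);
}
```

A caller would write something like `char buf[8]; format_unit_name(buf, sizeof(buf), "sd", 0);` so that the length always tracks the declaration of buf.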
|
41504 |
04-Dec-1998 |
rvb |
Don't print diagnostic anymore
|
41416 |
29-Nov-1998 |
dt |
Honor MNT_NOATIME.
PR: 8383 Submitted by: Carl Mascott <cmascott@world.std.com>
|
41287 |
22-Nov-1998 |
bde |
Return ENOTTY instead of EBADF for ioctls on dead vnodes. This fixes tcsetpgrp() on controlling terminals that are no longer associated with the session of the calling process, not to mention ioctl.2.
|
41275 |
21-Nov-1998 |
dt |
Support NT VFAT lower case flags.
PR: 8383 (Mostly) Submitted by: Carl Mascott <cmascott@world.std.com>
|
41202 |
16-Nov-1998 |
rvb |
A few bug fixes for Robert Watson
|
41173 |
15-Nov-1998 |
bde |
Finished updating module event handlers to be compatible with modeventhand_t.
|
41095 |
11-Nov-1998 |
rvb |
coda_lookup now passes up an extra flag. But old veni will be ok; new veni will check /dev/cfs0 to make sure that a new kernel is running. Also, a bug in vc_nb_close iff CODA_SIGNAL's were seen has been fixed.
|
41059 |
10-Nov-1998 |
peter |
add #include <sys/kernel.h> where it's needed by MALLOC_DEFINE()
|
41031 |
09-Nov-1998 |
peter |
"fix" a warning that has been bugging me for ages. Eliminate a couple of temporary variables since they are only used once and their types were the cause of the warnings.
|
40857 |
03-Nov-1998 |
peter |
Support KLD. We register and unregister two modules. "coda" (the vfs) via VFS_SET(), and "codadev" for the cdevsw entry. From kldstat -v:
 3    1 0xf02c5000 115d8    coda.ko
        Contains modules:
                Id Name
                 2 codadev
                 3 coda
|
40852 |
03-Nov-1998 |
peter |
Change the #ifdef UNION code into a callable hook. Arrange to have this set up when unionfs is present, either statically or as a kld module.
|
40790 |
31-Oct-1998 |
peter |
Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.
|
40717 |
29-Oct-1998 |
peter |
Use vtruncbuf() rather than vinvalbuf() when shortening files.
|
40708 |
28-Oct-1998 |
rvb |
Change the way unmounting happens to guarantee that the client programs are allowed to finish up (coda_call is forced to complete) and release their locks. Thus there is a reasonable chance that the vflush implicit in the unmount will not get hung on held locks.
|
40706 |
28-Oct-1998 |
rvb |
Venus must be passed O_CREAT flag on VOP_OPEN iff this is a creat so that we will allow a mode 444 file to be written into. Sync with the latest coda.h and deal with collateral damage.
|
40700 |
28-Oct-1998 |
dg |
Added a second argument, "activate" to the vm_page_unwire() call so that the caller can select either inactive or active queue to put the page on.
|
40660 |
26-Oct-1998 |
bde |
Removed redundant bitrotted checks for major numbers instead of updating them.
|
40651 |
25-Oct-1998 |
bde |
Don't follow null bdevsw pointers. The `major(dev) < nblkdev' test rotted when bdevsw[] became sparse. We still depend on magic to avoid having to check that (v_rdev) device numbers in vnodes are not NODEV.
|
40648 |
25-Oct-1998 |
phk |
Nitpicking and dusting performed on a train. Removes trivial warnings about unused variables, labels and other lint.
|
39778 |
29-Sep-1998 |
rvb |
Fixes for lkm: 1. use VFS_LKM vs ACTUALLY_LKM_NOT_KERNEL 2. don't pass -DCODA to lkm build
|
39728 |
28-Sep-1998 |
rvb |
Cleanup and fix THE bug
|
39651 |
25-Sep-1998 |
rvb |
Don't lose this file
|
39650 |
25-Sep-1998 |
rvb |
Put "stray" printouts under DIAGNOSTIC. Make everything build with DEBUG on. Add support for lkm. (The macros don't work for me; for a good chuckle look at the end of coda_fbsd.c.)
|
39187 |
14-Sep-1998 |
sos |
Remove the SLICE code. This clearly needs a lot more thought, and we don't need this to hunt us down in 3.0-RELEASE.
|
39129 |
13-Sep-1998 |
dt |
Remove unused variable.
Pointed out by: bde
|
39128 |
13-Sep-1998 |
dt |
Fix a bug related to renaming in root directory. This bug was reported by Cejka Rudolf <cejkar@dcse.fee.vutbr.cz> on freebsd-current in Message-Id <199807141023.MAA09803@kazi.dcse.fee.vutbr.cz>.
Reviewed by: bde
|
39126 |
13-Sep-1998 |
rvb |
Finish conversion of cfs -> coda
|
39111 |
12-Sep-1998 |
phk |
various nits that didn't make it through the brucefilter.
|
39085 |
11-Sep-1998 |
rvb |
All the references to cfs, in symbols, structs, and strings have been changed to coda. (Same for CFS.)
|
38909 |
07-Sep-1998 |
bde |
Removed statically configured mount type numbers (MOUNT_*) and all references to them.
The change a couple of days ago to ignore these numbers in statically configured vfsconf structs was slightly premature because the cd9660, cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number in their vfsconf struct.
|
38903 |
07-Sep-1998 |
guido |
Fix problem reported on bugtraq: check permission of device mounted for non-root users. Fortunately, the default for vfs.usermount is 0. Tested by: "Jan B. Koum " <jkb@best.com>
|
38884 |
06-Sep-1998 |
rvb |
Clean LINT
|
38862 |
05-Sep-1998 |
phk |
Add a new vnode op, VOP_FREEBLKS(), which filesystems can use to inform device drivers about sectors no longer in use.
Device-drivers receive the call through d_strategy, if they have D_CANFREE in d_flags.
This allows flash based devices to erase the sectors and avoid pointlessly carrying them around in compactions.
Reviewed by: Kirk Mckusick, bde Sponsored by: M-Systems (www.m-sys.com)
|
38799 |
04-Sep-1998 |
dfr |
Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.
|
38759 |
02-Sep-1998 |
rvb |
Pass2 complete
|
38625 |
29-Aug-1998 |
rvb |
Very Preliminary Coda
|
38545 |
25-Aug-1998 |
phk |
sort the prototypes
|
38529 |
24-Aug-1998 |
phk |
Last commit managed to get mangled somehow.
|
38525 |
24-Aug-1998 |
phk |
Remove the last remaining evidence of B_TAPE. Reclaim 3 unused bits in b_flags
|
38489 |
23-Aug-1998 |
bde |
Enabled Lite2 fix for reading from dead ttys.
|
38408 |
17-Aug-1998 |
bde |
Removed unused includes.
|
38354 |
16-Aug-1998 |
bde |
Use [u]intptr_t instead of [u_]long for casts between pointers and integers. Don't forget to cast to (void *) as well.
|
37977 |
30-Jul-1998 |
bde |
Fixed printf format errors.
|
37898 |
27-Jul-1998 |
alex |
Style fixes and a bug fix: don't remove the exit handler if unmount fails.
Submitted by: bde
|
37877 |
27-Jul-1998 |
alex |
A better solution to the rm_at_exit problem: Register the exit function during first mount. Unregister the exit function at last unmount.
Concept by: sef Reviewed by: sef Implemented by: alex
|
37864 |
25-Jul-1998 |
alex |
Override the default VFS LKM dispatch functions so that a module unload function can be provided (this is necessary to unregister the at_exit handler).
|
37653 |
15-Jul-1998 |
bde |
Cast pointers to [u]intptr_t instead of to [unsigned] long.
|
37649 |
15-Jul-1998 |
bde |
Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this change is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.
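For illustration only, a minimal sketch of the cast pattern named in this commit: converting a pointer to an integer through uintptr_t, which is defined to be wide enough to hold a pointer, and converting back through (void *). The helper names ptr_low_bits and ptr_round_trip are hypothetical and do not come from the tree.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return selected low bits of a pointer's address,
 * e.g. for an alignment check.  Going through uintptr_t keeps the full
 * pointer width; a cast to unsigned long can warn or truncate on
 * platforms where long is narrower than a pointer.
 */
static uintptr_t
ptr_low_bits(const void *p, uintptr_t mask)
{
	return ((uintptr_t)p & mask);
}

/* Round trip: pointer -> integer -> pointer, casting back via (void *). */
static void *
ptr_round_trip(void *p)
{
	uintptr_t bits = (uintptr_t)p;

	return ((void *)bits);
}
```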
|
37555 |
11-Jul-1998 |
bde |
Fixed printf format errors.
|
37465 |
07-Jul-1998 |
bde |
Quick fix for type mismatches which were fatal if longs aren't 32 bits. We used a private, wrong, version of `struct dirent' to help break getdirentries(), and we use a silly check that the size of this struct is a power of 2 to help break mount() if getdirentries() would not work. This fix just changes the struct to match `struct dirent' (except for the name length).
|
37389 |
04-Jul-1998 |
julian |
There is no such thing any more as "struct bdevsw".
There is only cdevsw (which should be renamed in a later edit to deventry or something). cdevsw contains the union of what were in both bdevsw and cdevsw entries. The bdevsw[] table still exists and is a second pointer to the cdevsw entry of the device. Its major is in d_bmaj rather than d_maj. Some cleanup is still to happen (e.g. dsopen now gets two pointers to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw).
rawread()/rawwrite() went away as part of this though it's not strictly the same patch, just that it involves all the same lines in the drivers.
cdroms no longer have write() entries (they did have rawwrite (?)). tapes no longer have support for bdev operations.
Reviewed by: Eivind Eklund and Mike Smith Changes suggested by eivind.
|
37384 |
04-Jul-1998 |
julian |
VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>
|
37154 |
25-Jun-1998 |
dt |
Remove "not hungly" panics. Cookies now used by the linux and ibcs2 emulators. The emulators assume that filesystem may just ignore cookies, and handle this case correctly. So we just ignore cookies.
Also sync *_readdir "prototypes" with reality.
|
36969 |
14-Jun-1998 |
bde |
Avoid a 64-bit division in procfs_readdir(). Fixed related overflows. Check args using the same expression as in fdesc and kernfs. The check was actually already correct, modulo overflow. It could be tightened up to either allow huge (aligned) offsets, treating them as EOF, or disallow all offsets beyond EOF.
Didn't fix invalid address calculation &foo[i] where i may be out of bounds.
Didn't fix shooting of foot using a private unportable dirent struct.
|
36963 |
14-Jun-1998 |
bde |
Avoid a 64-bit division in fdesc_readdir(). Fixed related overflows and missing arg checking.
Panic instead of returning bogus error codes or forgetting to check all cases if fdesc_readdir() gets called for a non-directory. This can't happen.
|
36873 |
10-Jun-1998 |
dfr |
Make these files compile.
|
36864 |
10-Jun-1998 |
alex |
ENOPNOTSUPP --> EOPNOTSUPP
PR: 6906 Submitted by: Steven G. Kargl <kargl@troutmask.apl.washington.edu>
|
36858 |
10-Jun-1998 |
dt |
Back out previous change. This behavior is at least completely "susv2"-compliant.
|
36851 |
10-Jun-1998 |
dt |
Also return EOPNOTSUPP rather than EINVAL for not supported owner and group changes.
|
36840 |
10-Jun-1998 |
peter |
Don't silently accept attempts to change flags where they are not supported.
|
36839 |
10-Jun-1998 |
peter |
Return EOPNOTSUPP rather than EINVAL for flags that are not supported.
|
36811 |
09-Jun-1998 |
dt |
Fix typo in a comment.
|
36735 |
07-Jun-1998 |
dfr |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us in line with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change.
The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
36275 |
21-May-1998 |
dyson |
Make flushing dirty pages work correctly on filesystems that unexpectedly do not complete writes even with sync I/O requests. This should help the behavior of mmaped files when using softupdates (and perhaps in other circumstances also.)
|
36168 |
19-May-1998 |
tegge |
Disallow reading the current kernel stack. Only the user structure and the current registers should be accessible. Reviewed by: David Greenman <dg@root.com>
|
36154 |
18-May-1998 |
dt |
Fix priority bug in previous commit.
Submitted by: bde
|
36133 |
17-May-1998 |
dt |
Fix support for pre-Win95 filesystems: Make it possible to lookup just created short file name. Don't insert "generation numbers".
|
36130 |
17-May-1998 |
dt |
Remove bogus LK_RETRY.
Submitted by: bde
|
36123 |
17-May-1998 |
bde |
Don't forget to clean up after an error reading the directory entry in deget().
|
36122 |
17-May-1998 |
bde |
Removed vestiges of pre-Lite2 locking.
|
36119 |
17-May-1998 |
phk |
s/nanoruntime/nanouptime/g s/microruntime/microuptime/g
Reviewed by: bde
|
36117 |
17-May-1998 |
sos |
Cleanup after Garret, include unpch.h to get at various macros..
|
35871 |
09-May-1998 |
dt |
Fix off-by-one error in previous commit.
This caused the following commands to corrupt the '..' entry in 'z':
mkdir z
cd z
touch A B
mv B A
Reported by: bde
|
35823 |
07-May-1998 |
msmith |
In the words of the submitter:
--------- Make callers of namei() responsible for releasing references or locks instead of having the underlying filesystems do it. This eliminates redundancy in all terminal filesystems and makes it possible for stacked transport layers such as umapfs or nullfs to operate correctly.
Quality testing was done with testvn, and lat_fs from the lmbench suite.
Some NFS client testing courtesy of Patrik Kudo.
vop_mknod and vop_symlink still release the returned vpp. vop_rename still releases 4 vnode arguments before it returns. These remaining cases will be corrected in the next set of patches. ---------
Submitted by: Michael Hancock <michaelh@cet.co.jp>
|
35769 |
06-May-1998 |
msmith |
As described by the submitter:
Reverse the VFS_VRELE patch. Reference counting of vnodes does not need to be done per-fs. I noticed this while fixing vfs layering violations. Doing reference counting in generic code is also the preference cited by John Heidemann in recent discussions with him.
The implementation of alternative vnode management per-fs is still a valid requirement for some filesystems but will be revisited sometime later, most likely using a different framework.
Submitted by: Michael Hancock <michaelh@cet.co.jp>
|
35511 |
29-Apr-1998 |
dt |
Use DFLTBSIZE instead of MAXBSIZE for pm_fatblksize.
In msdosfs_sync: spelling fix, formatting changes; fix MNT_LAZY (sync modified denodes, don't sync device)
Mostly submitted by (and with hints from): bde
Increase limit for maximum disk size: as far as I can see previous limit was gratuitously too low.
|
35497 |
29-Apr-1998 |
dyson |
Tighten up management of memory and swap space during map allocation, deallocation cycles. This should provide a measurable improvement on swap and memory allocation on loaded systems. It is unlikely a complete solution. Also, provide more map info with procfs. Chuck Cranor spurred on this improvement.
|
35360 |
20-Apr-1998 |
julian |
The 'mountroot' option is obviously pointless for an LKM so allow LKM compilation to succeed by making it go away for that case. Saves needing to include opt_devfs.h which an LKM cannot rely on anyhow.
|
35323 |
20-Apr-1998 |
julian |
Make the devfs SLICE option a standard type option. (hopefully it will go away eventually anyhow)
|
35319 |
19-Apr-1998 |
julian |
Add changes and code to implement a functional DEVFS. This code will be turned on with the TWO options DEVFS and SLICE. (see LINT) Two labels PRE_DEVFS_SLICE and POST_DEVFS_SLICE will delineate these changes.
/dev will be automatically mounted by init (thanks phk) on bootup. See /sys/dev/slice/slice.4 for more info. All code should act the same without these options enabled.
Mike Smith, Poul Henning Kamp, Soeren, and a few dozen others
This code does not support the following: bad144 handling. Persistence. (My head is still hurting from the last time we discussed this) ATAPI floppies are not handled by the SLICE code yet.
When this code is running, all major numbers are arbitrary and COULD be dynamically assigned. (this is not done, for POLA only) Minor numbers for disk slices ARE arbitrary and dynamically assigned.
|
35256 |
17-Apr-1998 |
des |
Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.
|
35210 |
15-Apr-1998 |
bde |
Support compiling with `gcc -ansi'.
|
35202 |
15-Apr-1998 |
dt |
Add a missing LK_RETRY. Noticed by: Bruce (almost 2 months ago)
Remove a debugging printf.
|
35063 |
06-Apr-1998 |
phk |
Use random() rather than homegrown stuff.
|
35046 |
05-Apr-1998 |
ache |
Print explanation diagnostics when mount is impossible Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
|
35029 |
04-Apr-1998 |
phk |
Time changes mark 2:
* Figure out UTC relative to boottime. Four new functions provide time relative to boottime.
* move "runtime" into struct proc. This helps fix the calcru() problem in SMP.
* kill mono_time.
* add timespec{add|sub|cmp} macros to time.h. (XXX: These may change!)
* nanosleep, select & poll takes long sleeps one day at a time
Reviewed by: bde Tested by: ache and others
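To make the timespec{add|sub|cmp} item above concrete, here is a minimal sketch of what add, subtract, and compare operations on struct timespec involve. It uses plain functions with hypothetical names (ts_add, ts_sub, ts_cmp, NSEC_PER_SEC) and assumes normalized inputs with tv_nsec in [0, 1e9); it is not the macro set added to time.h by this commit, whose exact form the message itself says may change.

```c
#include <time.h>

#define NSEC_PER_SEC	1000000000L	/* hypothetical constant name */

/* Return a + b, carrying nanosecond overflow into tv_sec. */
static struct timespec
ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= NSEC_PER_SEC) {
		r.tv_sec++;
		r.tv_nsec -= NSEC_PER_SEC;
	}
	return (r);
}

/* Return a - b, borrowing from tv_sec when tv_nsec goes negative. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += NSEC_PER_SEC;
	}
	return (r);
}

/* Return <0, 0, or >0 as a is earlier than, equal to, or later than b. */
static int
ts_cmp(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return (a.tv_sec < b.tv_sec ? -1 : 1);
	if (a.tv_nsec != b.tv_nsec)
		return (a.tv_nsec < b.tv_nsec ? -1 : 1);
	return (0);
}
```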
|
34920 |
28-Mar-1998 |
ache |
Fix dead hang writing to FAT Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
|
34901 |
26-Mar-1998 |
phk |
Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable. gettime() is now a macro front for getmicrotime().
Various patches to use the two new functions instead of the various hacks used in their absence.
Some punctuation and grammar patches from Bruce.
A couple of XXX comments.
|
34698 |
20-Mar-1998 |
kato |
Deleted 1024bytes/sector floppy code for PC-98 arch. The 1024bytes/sector code has not worked for long time and it should be re-implemented.
|
34642 |
17-Mar-1998 |
kato |
If lowervp is NULLVP, vap was clobbered.
Submitted by: Naofumi Honda <honda@Kururu.math.sci.hokudai.ac.jp> Obtained from: NetBSD/pc98
|
34266 |
08-Mar-1998 |
julian |
Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman) Submitted by: Kirk McKusick (mcKusick@mckusick.com) Obtained from: WHistle development tree
|
34249 |
08-Mar-1998 |
dyson |
Initialize b_resid, and also print out better diagnostics on I/O errors. This will allow for better tracking of user error reports.
|
34206 |
07-Mar-1998 |
dyson |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated.
1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid.
2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them.
3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state.
4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances.
5) Remove a gratuitous buffer wakeup in vfs_vmio_release.
6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version.
7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation.
8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed.
9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals.
10) Correct and clean-up spec_getpages.
11) Implement a fully functional nfs_getpages, nfs_putpages.
12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.)
13) Properly support MS_INVALIDATE on NFS.
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean.
15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads.
16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.)
17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.)
18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
34096 |
06-Mar-1998 |
msmith |
Trivial filesystem getpages/putpages implementations, set the second. These should be considered the first steps in a work-in-progress. Submitted by: Terry Lambert <terry@freebsd.org>
|
34023 |
04-Mar-1998 |
dyson |
Fix certain kinds of block device operations. For example, tunefs on a block device shouldn't crash the system anymore.
|
34002 |
03-Mar-1998 |
msmith |
Patch to the last commit; attempt to unspam stuff from NetBSD. Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
|
33964 |
01-Mar-1998 |
msmith |
The intent is to get rid of WILLRELE in vnode_if.src by making a complement to all ops that return a vpp, VFS_VRELE. This is initially only for file systems that implement the following ops that do a WILLRELE:
vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link, vop_rename, vop_mkdir, vop_rmdir, vop_symlink
This is initial DNA that doesn't do anything yet. VFS_VRELE is implemented but not called.
A default vfs_vrele was created for fs implementations that use the standard vnode management routines.
VFS_VRELE implementations were made for the following file systems:
Standard (vfs_vrele): ffs, mfs, nfs, msdosfs, devfs, ext2fs
Custom: union, umapfs
Just EOPNOTSUPP: fdesc, procfs, kernfs, portal, cd9660
These implementations may change as VOP changes are implemented.
In the next phase, in the vop implementations calls to vrele and the vrele part of vput will be moved to the top layer vfs_vnops and made visible to all layers. vput will be replaced by unlock in these cases. Unlocking will still be done in the per fs layer but the refcount decrement will be triggered at the top because it doesn't hurt to hold a vnode reference a little longer. This will have minimal impact on the structure of the existing code.
This will only be done for vnode arguments that are released by the various fs vop implementations.
Wider use of VFS_VRELE will likely require restructuring of the code.
Reviewed by: phk, dyson, terry et al. Submitted by: Michael Hancock <michaelh@cet.co.jp>
|
33959 |
01-Mar-1998 |
msmith |
Fix mmap() on msdosfs. In the words of the submitter:
|In the process of evaluating the getpages/putpages issues I discovered
|that mmap on MSDOSFS does not work. This is because I blindly merged
|NetBSD changes in msdosfs_bmap and msdosfs_strategy. Apparently, their
|blocksize is always DEV_BSIZE (even in files), while in FreeBSD
|blocksize in files is v_mount->mnt_stat.f_iosize (i.e. clustersize in
|MSDOSFS case). The patch is below.
Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
|
33872 |
27-Feb-1998 |
msmith |
Fix a problem with the conversion of Unix filenames into the VFAT namespace. Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
|
33848 |
26-Feb-1998 |
msmith |
Fixes for some bugs in the VFAT/FAT32 support:
- 'mv longnamedfile1 longnamedfile2' would cause longnamedfile2 to lose its long name.
- Long names have trailing spaces/dots stripped for lookup as well as assignment.
- A lockup when the msdosfs was accessed from within the Linux emulator is fixed.
- A bug whereby long filenames were recognised by Microsoft operating systems but not FreeBSD is fixed.
Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
|
33844 |
26-Feb-1998 |
kato |
Deleted KLOCK-hack.
|
33791 |
24-Feb-1998 |
ache |
Back out "always view in lowercase" part.
Return to previous variant "comparing in lowercase" in winChkName.
|
33768 |
23-Feb-1998 |
ache |
Implement loadable DOS<->local conversion tables for DOS names.
Always create DOS name in uppercase.
Always view DOS name in lowercase.
|
33765 |
23-Feb-1998 |
kato |
Fix signatures of NEC's DOS formats.
Submitted by: Takahashi Yoshihiro <nyan@wyvern.cc.kogakuin.ac.jp>
|
33762 |
23-Feb-1998 |
ache |
Oops, add missing bcopy of upper->lower table
|
33760 |
23-Feb-1998 |
ache |
Implement loadable upper->lower local conversion table
|
33751 |
22-Feb-1998 |
ache |
Reduce new arguments number added in my changes
|
33750 |
22-Feb-1998 |
ache |
Add Unicode support to winChkName, now lookup works!
|
33747 |
22-Feb-1998 |
ache |
Implement loadable local<->unicode file names conversion.
Note: it produces correct names only for Win95; DOS names are still incorrect and need similar work.
mount_msdos support coming soon.
|
33745 |
22-Feb-1998 |
ache |
Replace all unknown Unicode characters with '?' in win->unix mapping
|
33744 |
22-Feb-1998 |
ache |
Add initial support to map 0x4XX Unicode Cyrillic range names: only win->unix part is implemented at this time with 256-byte table defaulted to KOI8-R (will be loadable in future). Since back mapping not supported yet, you'll get "No such file or directory" on each Cyrillic name with 'ls -l', only 'echo *' work at this moment. Teach current code to understand Unicode a bit.
|
33676 |
20-Feb-1998 |
bde |
Removed unused #includes.
|
33548 |
18-Feb-1998 |
jkh |
Update MSDOSFS code using NetBSD's msdosfs as a guide to support FAT32 partitions. Unfortunately, we looked around here at Walnut Creek CDROM for any newer FAT32-supporting versions of Win95 and we were unsuccessful; only the older stuff here. So this is untested beyond simply making sure it compiles and someone with access to an actual FAT32 fs will have to let us know how well it actually works. Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru> Obtained from: NetBSD
|
33215 |
10-Feb-1998 |
kato |
Deleted unused variable.
|
33211 |
10-Feb-1998 |
kato |
Undo UN_KLOCK hack except union_allocvp(). Now, vput() doesn't lock the vnode.
|
33181 |
09-Feb-1998 |
eivind |
Staticize.
|
33146 |
07-Feb-1998 |
kato |
Fixed pagefault when cred == NOCRED.
PR: 5632
|
33145 |
07-Feb-1998 |
kato |
Fixed number of entries in gid-mapfile.
PR: 5640
|
33134 |
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
33129 |
06-Feb-1998 |
kato |
Workaround for DIAGNOSTIC kernel's panic in union_lookup(). Union_removed_upper() clobbers cache when file is removed. Upper vp will be removed by union_reclaim().
|
33109 |
05-Feb-1998 |
dyson |
1) Start using a cleaner and more consistent page allocator instead of the various ad-hoc schemes.
2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some processor errata, and to minimize redundant processor updating of page tables.
4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed.
5) Streamline read-only to read-write page mappings.
6) For pmap_copy_page, don't enable write mapping for source page.
7) Correct and clean-up pmap_incore.
8) Cluster initial kern_exec pagein.
9) Removal of some minor lint from kern_malloc.
10) Correct some ioopt code.
11) Remove some dead code from the MI swapout routine.
12) Correct vm_object_deallocate (to remove backing_object ref.)
13) Fix dead object handling, that had problems under heavy memory load.
14) Add minor vm_page_lookup improvements.
15) Some pages are not in objects, and make sure that the vm_page.c can properly support such pages.
16) Add some more page deficit handling.
17) Some minor code readability improvements.
|
33108 |
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
33054 |
03-Feb-1998 |
bde |
Forward declare some structs so that this file is more self-sufficient.
|
33052 |
03-Feb-1998 |
bde |
Forward declare some structs so that this file is more self-sufficient.
Don't declare kernel objects or functions unless KERNEL is defined.
|
33037 |
03-Feb-1998 |
kato |
Declare the variable `i' when UMAP_DIAGNOSTIC is defined.
|
32929 |
31-Jan-1998 |
eivind |
Make the debug options new-style.
This also zaps a DPT option from lint; it wasn't referenced from anywhere.
|
32760 |
25-Jan-1998 |
kato |
Fixed typo in comment.
|
32702 |
22-Jan-1998 |
dyson |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous.
2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.)
13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
32689 |
22-Jan-1998 |
kato |
Delete unused code in union_fsync().
|
32642 |
20-Jan-1998 |
kato |
- Move SETKLOCK and CLEARKLOCK macros into union.h.
- Set UN_ULOCK in union_lock() when UN_KLOCK is set. Caller expects that vnode is locked correctly, and may call another function which expects locked vnode and may unlock the vnode.
- Do not assume the behavior of inside functions in FreeBSD's vfs_subr.c is same as 4.4BSD-Lite2. Vnode may be locked in vget() even though flag is zero. (Locked vnode is, of course, unlocked before returning from vget.)
|
32599 |
18-Jan-1998 |
kato |
Workaround for locking violation while recycling vnode which union fs used in freelist.
|
32598 |
18-Jan-1998 |
kato |
Improve and revise fixes for locking violation.
Obtained from: NetBSD/pc98
|
32286 |
06-Jan-1998 |
dyson |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does.
When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex.
Vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore.
A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes.
Please be careful with your system while running this code, but I would greatly appreciate feedback as soon as reasonably possible.
|
32285 |
06-Jan-1998 |
sef |
Use CHECKIO in procfs_ioctl() to ensure that any changes in UID/GID result in the expected failure.
|
32150 |
01-Jan-1998 |
bde |
Fixed missing initialization of mp->mnt_stat. At least vm depends on at least mp->mnt_stat.f_iosize being nonzero.
PR: 5212
|
32120 |
30-Dec-1997 |
bde |
Fixed a missing/misplaced/misstyled prototype.
|
32071 |
29-Dec-1997 |
dyson |
Lots of improvements, including restructuring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include:
1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.
Be gentle, and please give me feedback asap.
|
32011 |
27-Dec-1997 |
bde |
Unspammed nested include of <vm/vm_zone.h>.
|
31929 |
21-Dec-1997 |
joerg |
Properly clean out the SI_MOUNTEDON flag iff the mount attempt fails half the way down. Otherwise, further attempts to mount the device will be rejected with BUSY.
IMHO, this flag can completely go away for cd9660. There's no reason you need to prevent CDs from being mounted multiple times, and in case of multisession CDs it can even make sense to mount two different sessions by the same time (to different mount points, otherwise it would be pointless ;).
|
31891 |
20-Dec-1997 |
sef |
Clear the p_stops field on change of user/group id, unless the correct flag is set in the p_pfsflags field. This, essentially, prevents an SUID program from hanging after being traced. (E.g., "truss /usr/bin/rlogin" would fail, but leave rlogin in a stopevent state.) Yet another case where procctl is (hopefully ;)) no longer needed in the general case.
Reviewed by: bde (thanks bruce :))
|
31860 |
19-Dec-1997 |
bde |
Set the sender's low watermark to match the maximum size for atomic writes that we advertise (PIPE_BUF = 512).
|
31727 |
15-Dec-1997 |
wollman |
Add support for poll(2) on files. vop_nopoll() now returns POLLNVAL if one of the new poll types is requested; hopefully this will not break any existing code. (This is done so that programs have a dependable way of determining whether a filesystem supports the extended poll types or not.)
The new poll types added are:
POLLWRITE - file contents may have been modified
POLLNLINK - file was linked, unlinked, or renamed
POLLATTRIB - file's attributes may have been changed
POLLEXTEND - file was extended
Note that the internal operation of poll() means that it is impossible for two processes to reliably poll for the same event (this could be fixed but may not be worth it), so it is not possible to rewrite `tail -f' to use poll at this time.
|
31701 |
13-Dec-1997 |
bde |
Fixed EOF handling.
1. SS_CANTRCVMORE was initially set on the wrong socket, so reads when there has never been a writer on the socket did not return 0. Note that such reads are only possible if the fifo was opened in (O_RDONLY | O_NONBLOCK) mode.
2. SS_CANTSENDMORE was initially set on the wrong socket, but this was harmless because the wrong socket is never sent from and there is no need to set the flag initially on the right socket (since open in (O_WRONLY | O_NONBLOCK) mode fails if there is no reader...).
3. SS_CANTRCVMORE was cleared when read() returns. This broke the case where read() returns 0 - subsequent reads are supposed to return 0 until a writer appears. There is no need to clear the flag when read() returns, since it is cleared correctly when a writer appears.
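Cases 1 and 3 can be observed from userland. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the helper name and the scratch path are made up for illustration and are not part of the commit. It opens a FIFO that has never had a writer in (O_RDONLY | O_NONBLOCK) mode and reads from it twice; with the fixed behaviour both reads should report EOF.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a FIFO, open it read-only and non-blocking
 * while no writer exists, and read from it twice.  The two read() results
 * are stored in *first and *second.  Returns 0 if the FIFO could be set
 * up and opened, -1 otherwise.
 */
int
fifo_read_without_writer(ssize_t *first, ssize_t *second)
{
	char path[64];
	char buf[16];
	int fd, rc = -1;

	/* Assumed scratch location; any writable directory would do. */
	snprintf(path, sizeof(path), "/tmp/fifo_eof_%ld", (long)getpid());
	unlink(path);			/* remove a stale node, if any */
	if (mkfifo(path, 0600) != 0)
		return (-1);
	fd = open(path, O_RDONLY | O_NONBLOCK);
	if (fd >= 0) {
		*first = read(fd, buf, sizeof(buf));	/* case 1 */
		*second = read(fd, buf, sizeof(buf));	/* case 3 */
		close(fd);
		rc = 0;
	}
	unlink(path);
	return (rc);
}
```

Both results are expected to be 0 rather than -1 with EAGAIN, since a FIFO without a writer is at EOF and not merely empty.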
|
31700 |
13-Dec-1997 |
bde |
Restored fifo_pathconf() from rev.1.32. vop_stdpathconf() is too general to be of much use. Using it here weakened the _PC_MAX_CANON, _PC_MAX_INPUT and _PC_VDISABLE cases.
fifo_pathconf() is not quite correct either. _PC_CHOWN_RESTRICTED and _PC_LINK_MAX should be handled by the host file system. For directories, the host file system should let us handle _PC_PIPE_BUF.
|
31691 |
13-Dec-1997 |
sef |
Change the ioctls for procfs around a bit; in particular, wherever possible, change from
ioctl(fd, PIOC<foo>, &i);
to
ioctl(fd, PIOC<foo>, i);
This is going from the _IOW to _IO ioctl macro. The kernel, procctl, and truss must be in synch for it all to work (not doing so will get errors about inappropriate ioctl's, fortunately). Hopefully I didn't forget anything :).
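The difference between the two calling conventions can be modelled in plain C. This is only a toy sketch: the enum and function names are hypothetical, and the real kernel path (copyin of the argument for _IOW commands) is reduced to a memcpy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for one PIOC<foo> command in its old and new form. */
enum cmd_style { CMD_BY_POINTER, CMD_BY_VALUE };

/*
 * Toy model of what happens to the third ioctl argument.  For an
 * _IOW-style command it is the address of an int that has to be copied
 * in; for an _IO-style command the argument word itself is the value.
 */
int
fetch_ioctl_arg(enum cmd_style style, uintptr_t arg)
{
	int value;

	if (style == CMD_BY_POINTER) {
		memcpy(&value, (const void *)arg, sizeof(value));
		return (value);
	}
	return ((int)arg);
}
```

A caller that still passes &i to a by-value command would hand over an address where a small integer is expected, which is why the kernel and the userland tools have to change together.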
|
31674 |
12-Dec-1997 |
sef |
Fix a problem with procfs_exit() that resulted in missing some procfs nodes; this also apparently caused a panic in some circumstances. Also, since procfs_exit() is getting rid of the nodes when a process exits, don't bother checking for the process' existence in procfs_inactive().
|
31640 |
09-Dec-1997 |
sef |
Code to prevent a panic caused by procfs_exit(). Note that I don't know what the root cause is -- but, sometimes, a procfs vnode in pfshead is apparently corrupt (or a UFS vnode instead). Without this patch, I can get it to panic by doing (in csh)
while (1)
ps auxwww
end
and it will panic when the PID's wrap. With it, it does not panic. Yes -- I know that this is NOT the right way to fix it. But I haven't been able to get it to panic yet (which confuses me). I am going to be looking into the vgone() code now, as that may be a part of it.
|
31636 |
08-Dec-1997 |
sef |
A couple of fixes from bruce: first of all, psignal is a void (stupid me; unfortunately, also makes it hard to check for errors); second, I had managed to forget a change to PIOCSFL (it should be _IOW, not _IOR) I had in my local copy, and Bruce called me on it.
Submitted by: bde
|
31618 |
08-Dec-1997 |
sef |
Use at_exit() to invoke procfs_exit() instead of calling it directly. Note that an unload facility should be used to call rm_at_exit() (if procfs is being loaded as an LKM and is subsequently removed), but it was non-obvious how to do this in the VFS framework.
Reviewed by: Julian Elischer
|
31595 |
07-Dec-1997 |
sef |
Clear the stop events and wakeup the process on the last close of the procfs/mem file. While this doesn't prevent an unkillable process, it means that a broken truss program won't do it accidentally now (well, there's a small window of opportunity). Note that this requires the change to truss I am about to commit.
|
31564 |
06-Dec-1997 |
sef |
Changes to allow event-based process monitoring and control.
|
31561 |
05-Dec-1997 |
bde |
Don't include <sys/lock.h> in headers when only `struct simplelock' is required. Fixed everything that depended on the pollution.
|
31273 |
18-Nov-1997 |
phk |
Staticize.
|
31271 |
18-Nov-1997 |
phk |
Staticize a few things.
|
31174 |
14-Nov-1997 |
tegge |
Don't try to obtain an exclusive lock on the vm map, since a deadlock might occur if the process owning the map is wiring pages.
|
31132 |
12-Nov-1997 |
julian |
Reviewed by: various.
Ever since I first saw the way the mount flags were used I've hated the fact that modes, and events, internal and exported, and short-term and long term flags are all thrown together. Finally it's annoyed me enough.. This patch to the entire FreeBSD tree adds a second mount flag word to the mount struct. It is not exported to userspace. I have moved some of the non exported flags over to this word. This means that we now have 8 free bits in the mount flags. There are another two that might well move over, but which I'm not sure about. The only user visible change would have been in pstat -v, except that davidg has disabled it anyhow. I'd still like to move the state flags and the 'command' flags apart from each other.. e.g. MNT_FORCE really doesn't have the same semantics as MNT_RDONLY, but that's left for another day.
|
31016 |
07-Nov-1997 |
phk |
Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by: -Wunused
|
30994 |
06-Nov-1997 |
phk |
Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead.
This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources.
I have not removed the /*ARGSUSED*/, they will require some looking at.
libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.
|
30785 |
27-Oct-1997 |
bde |
KNFize rev.1.31.
|
30784 |
27-Oct-1997 |
bde |
Use unique sleep message strings.
|
30782 |
27-Oct-1997 |
bde |
Use bread() instead of cluster_read() for reading the last block in a file. There was a (harmless, I think) off-by-1 error. This was fixed in ufs long ago (rev.1.21 of ufs_readwrite.c) but not in cd9660.
cd9660_read() has stagnated in many other ways. It is closer to the Net/2 ufs_read() (which it was cloned from) than ufs_read() itself is.
|
30780 |
27-Oct-1997 |
bde |
Removed unused #includes. The need for most of them went away with recent changes (docluster* and vfs improvements).
|
30743 |
26-Oct-1997 |
phk |
VFS interior redecoration.
Rename vn_default_error to vop_defaultop all over the place.
Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite.
Use vop_null instead of nullop.
Move vop_nopoll from vfs_subr.c to vfs_default.c
Move vop_sharedlock from vfs_subr.c to vfs_default.c
Move vop_nolock from vfs_subr.c to vfs_default.c
Move vop_nounlock from vfs_subr.c to vfs_default.c
Move vop_noislocked from vfs_subr.c to vfs_default.c
Use vop_ebadf instead of *_ebadf.
Add vop_defaultop for getpages on master vnode in MFS.
|
30637 |
21-Oct-1997 |
roberto |
Fix the same leak as in nullfs. Now the lowervp is properly marked inactive.
Reviewed by: phk
|
30636 |
21-Oct-1997 |
roberto |
Fix the file leak bug. The lower layer wasn't informed the vnode was inactive and kept a reference, preventing the blocks to be reclaimed.
Changed the comment in null_inactive to reflect the current situation.
Reviewed by: phk
|
30513 |
17-Oct-1997 |
phk |
Make a set of VOP standard lock, unlock & islocked VOP operators, which depend on the lock being located at vp->v_data. Saves 3x3 identical vop procs, more as the other filesystems becomes lock aware.
|
30496 |
16-Oct-1997 |
phk |
VFS clean up "hekto commit"
1. Add defaults for more VOPs
VOP_LOCK vop_nolock
VOP_ISLOCKED vop_noislocked
VOP_UNLOCK vop_nounlock
and remove direct reference in filesystems.
2. Rename the nfsv2 vnop tables to improve sorting order.
|
30492 |
16-Oct-1997 |
phk |
Another VFS cleanup "kilo commit"
1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS} interface function, and now lives in the ufsmount structure.
2. Remove VOP_SEEK, it was unused.
3. Add more default vops:
VOP_ADVLOCK vop_einval
VOP_CLOSE vop_null
VOP_FSYNC vop_null
VOP_IOCTL vop_enotty
VOP_MMAP vop_einval
VOP_OPEN vop_null
VOP_PATHCONF vop_einval
VOP_READLINK vop_einval
VOP_REALLOCBLKS vop_eopnotsupp
And remove identical functionality from filesystems
4. Add vop_stdpathconf, which returns the canonical stuff. Use it in the filesystems. (XXX: It's probably wrong that specfs and fifofs sets this vop, shouldn't it come from the "host" filesystem, for instance ufs or cd9660 ?)
5. Try to make system wide VOP functions have vop_* names.
6. Initialize the um_* vectors in LFS.
(Recompile your LKMS!!!)
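The default-vop idea from item 3 can be sketched outside the kernel as a table with a fallback. This is an illustration only: the enum, the table layout, and toyfs are hypothetical (the real kernel dispatches through vnodeop descriptors), and just four of the listed defaults are modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation indices for the sketch. */
enum vop { VOP_CLOSE, VOP_IOCTL, VOP_PATHCONF, VOP_REALLOCBLKS, VOP_COUNT };

typedef int (*vop_fn)(void *args);

static int vop_null(void *args) { (void)args; return (0); }
static int vop_einval(void *args) { (void)args; return (EINVAL); }
static int vop_enotty(void *args) { (void)args; return (ENOTTY); }
static int vop_eopnotsupp(void *args) { (void)args; return (EOPNOTSUPP); }

/* System-wide defaults, following the pairs listed in item 3. */
static const vop_fn default_vops[VOP_COUNT] = {
	[VOP_CLOSE] = vop_null,
	[VOP_IOCTL] = vop_enotty,
	[VOP_PATHCONF] = vop_einval,
	[VOP_REALLOCBLKS] = vop_eopnotsupp,
};

/* A made-up filesystem that only implements ioctl. */
static int toyfs_ioctl(void *args) { (void)args; return (0); }

static const vop_fn toyfs_vops[VOP_COUNT] = {
	[VOP_IOCTL] = toyfs_ioctl,
};

/* Use the filesystem's entry if it has one, else the default. */
int
vop_call(const vop_fn *fs_vops, enum vop op, void *args)
{
	vop_fn fn = fs_vops[op] != NULL ? fs_vops[op] : default_vops[op];

	return (fn(args));
}
```

With this shape a filesystem table lists only what it actually implements, which is the point of removing the identical stub functions from the individual filesystems.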
|
30474 |
16-Oct-1997 |
phk |
VFS mega cleanup commit (x/N)
1. Add new file "sys/kern/vfs_default.c" where default actions for VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE, POLL, REVOKE and STRATEGY. Various stuff spread over the entire tree belongs here.
2. Change VOP_BLKATOFF to a normal function in cd9660.
3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These are private interface functions between UFS and the underlying storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now live in struct ufsmount instead.
4. Remove a kludge of VOP_ functions in all filesystems, that did nothing but obscure the simplicity and break the expandability. If a filesystem doesn't implement VOP_FOO, it shouldn't have an entry for it in its vnops table. The system will try to DTRT if it is not implemented. There are still some cruft left, but the bulk of it is done.
5. Fix another VCALL in vfs_cache.c (thanks Bruce!)
|
30439 |
15-Oct-1997 |
phk |
vnops megacommit
1. Use the default function to access all the specfs operations.
2. Use the default function to access all the fifofs operations.
3. Use the default function to access all the ufs operations.
4. Fix VCALL usage in vfs_cache.c
5. Use VOCALL to access specfs functions in devfs_vnops.c
6. Staticize most of the spec and fifofs vnops functions.
7. Make UFS panic if it lacks bits of the underlying storage handling.
|
30434 |
15-Oct-1997 |
phk |
Hmm, realign the vnops into two columns.
|
30431 |
15-Oct-1997 |
phk |
Stylistic overhaul of vnops tables.
1. Remove comment stating the blatantly obvious.
2. Align in two columns.
3. Sort all but the default element alphabetically.
4. Remove XXX comments pointing out entries not needed.
|
30354 |
12-Oct-1997 |
phk |
Last major round (Unless Bruce thinks of something :-) of malloc changes.
Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them.
A couple of finer points by: bde
|
30309 |
11-Oct-1997 |
phk |
Distribute and staticize a lot of the malloc M_* types.
Substantial input from: bde
|
29888 |
27-Sep-1997 |
kato |
Clustered read and write are switched at mount-option level.
1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW bits of the mnt_flag. The sysctl variables, vfs.foo.doclusterread and vfs.foo.doclusterwrite are deleted. Only mount option can control clustered I/O from userland.
2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR and D_CLUSTERW bits of the d_flags member in the block device switch table. If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR / MNT_NOCLUSTERW bits will be set. In this case, MNT_NOCLUSTERR and MNT_NOCLUSTERW cannot be cleared from userland.
3. Vnode driver disables both clustered read and write.
4. Union filesystem disables clustered write.
Reviewed by: bde
|
29653 |
21-Sep-1997 |
dyson |
Change the M_NAMEI allocations to use the zone allocator. This change plus the previous changes to use the zone allocator decrease the usage of malloc by half. The Zone allocator will be upgradeable to be able to use per CPU-pools, and has more intelligent usage of SPLs. Additionally, it has reasonable stats gathering capabilities, while making most calls inline.
|
29584 |
18-Sep-1997 |
phk |
Executing binaries on a nullfs (or nullfs-based) filesystem results in a trap. PR: 3104 Reviewed by: phk Submitted by: Dan Walters hannibal@cyberstation.net
|
29362 |
14-Sep-1997 |
peter |
Convert select -> poll. Delete 'always succeed' select/poll handlers, replaced with generic call. Flag missing vnode op table entries.
|
29286 |
10-Sep-1997 |
phk |
Fix a typo in a comment and remove some checks now done centrally.
|
29285 |
10-Sep-1997 |
phk |
This stuff is now done centrally.
|
29208 |
07-Sep-1997 |
bde |
Removed yet more vestiges of config-time swap configuration and/or cleaned up nearby cruft.
|
29180 |
07-Sep-1997 |
bde |
Staticized.
|
29179 |
07-Sep-1997 |
bde |
Some staticized variables were still declared to be extern.
|
29084 |
04-Sep-1997 |
kato |
Support read-only mount.
|
29041 |
02-Sep-1997 |
bde |
Removed unused #includes.
|
28844 |
28-Aug-1997 |
kato |
Include "opt_ddb.h" only when NULLFS_DIAGNOSTIC is defined.
|
28832 |
27-Aug-1997 |
kato |
Fixed NULLFS_DIAGNOSTIC stuff.
|
28787 |
26-Aug-1997 |
phk |
Uncut&paste cache_lookup().
This unifies 50 lines of code that were, in theory, identical in several places.
The filesystems have a new method: vop_cachedlookup, which is the meat of the lookup, and use vfs_cache_lookup() for their vop_lookup method. vfs_cache_lookup() will check the namecache and pass on to the vop_cachedlookup method in case of a miss.
It's still the task of the individual filesystems to populate the namecache with cache_enter().
Filesystems that do not use the namecache will just provide the vop_lookup method as usual.
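The split described above can be sketched in userland C. Everything here is a toy stand-in under stated assumptions: the struct, the fixed-size array cache, and the fake lookup result are invented for illustration and only mirror the roles of vfs_cache_lookup(), vop_cachedlookup, and cache_enter().

```c
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8	/* arbitrary size for the toy name cache */

struct cache_entry {
	const char *name;	/* not copied; caller keeps it alive */
	int vnode_id;
};

struct toy_fs {
	struct cache_entry cache[CACHE_SLOTS];
	size_t ncached;
	int misses;		/* how often the slow path ran */
	/* Stand-in for the vop_cachedlookup method. */
	int (*cachedlookup)(struct toy_fs *fs, const char *name);
};

/* Stand-in for cache_enter(): still called by the filesystem itself. */
void
toy_cache_enter(struct toy_fs *fs, const char *name, int vnode_id)
{
	if (fs->ncached < CACHE_SLOTS) {
		fs->cache[fs->ncached].name = name;
		fs->cache[fs->ncached].vnode_id = vnode_id;
		fs->ncached++;
	}
}

/* Stand-in for vfs_cache_lookup(): cache first, method on a miss. */
int
toy_cache_lookup(struct toy_fs *fs, const char *name)
{
	size_t i;

	for (i = 0; i < fs->ncached; i++)
		if (strcmp(fs->cache[i].name, name) == 0)
			return (fs->cache[i].vnode_id);
	return (fs->cachedlookup(fs, name));
}

/* Example slow path: fakes a lookup result and populates the cache. */
int
toy_cachedlookup(struct toy_fs *fs, const char *name)
{
	int vnode_id = (int)strlen(name);

	fs->misses++;
	toy_cache_enter(fs, name, vnode_id);
	return (vnode_id);
}
```

The generic function owns the cache probe, so each filesystem only supplies the miss path and decides what to enter into the cache.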
|
28774 |
26-Aug-1997 |
dyson |
Back out some incorrect changes that were worse than the original bug.
|
28716 |
25-Aug-1997 |
kato |
Added a sysctl arg, vfs.cd9660.doclusterread. Deleted debug and !FreeBSD code around cluster read stuff.
|
28558 |
22-Aug-1997 |
dyson |
This is a trial improvement for the vnode reference count while on the vnode free list problem. Also, the vnode age flag is no longer used by the vnode pager. (It is actually incorrect to use then.) Constructive feedback welcome -- just be kind.
|
28270 |
16-Aug-1997 |
wollman |
Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to now malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.
|
28233 |
15-Aug-1997 |
kato |
Added DIAGNOSTIC routine to test inconsistency of vnode when cnp points `.'.
Obtained from: NetBSD
|
28232 |
15-Aug-1997 |
kato |
Deleted unused code which adjusts UN_UNLOCK flag.
|
28189 |
14-Aug-1997 |
kato |
If the user doesn't have read permission, union_copyup should not copy a file to upper layer.
Reviewed by: Naofumi Honda <honda@Kururu.math.sci.hokudai.ac.jp>
|
28188 |
14-Aug-1997 |
kato |
Backed out part of previous change. The example of -b mount in manpage works again.
|
28101 |
12-Aug-1997 |
kato |
Fixed vnode corruption by undefined case in union_lookup(). When uerror == 0 && lerror == EACCES, lowervp == NULLVP and union_allocvp doesn't find existing union node and new union node is created.
Since it is difficult to cover all the cases, union_lookup always returns when union_lookup1() returns EACCES.
Submitted by: Naofumi Honda <honda@Kururu.math.sci.hokudai.ac.jp> Obtained from: NetBSD/pc98
|
28089 |
12-Aug-1997 |
sef |
Check permissions for fp regs as well as normal regs.
|
28086 |
12-Aug-1997 |
sef |
Fix procfs security hole -- check permissions on meaningful I/Os (namely, reading/writing of mem and regs). Also have to check for the requesting process being group KMEM -- this is a bit of a hack, but ps et al need it.
Reviewed by: davidg
|
27845 |
02-Aug-1997 |
bde |
Removed unused #includes.
|
26964 |
26-Jun-1997 |
alex |
More comment cleanup.
|
26963 |
26-Jun-1997 |
alex |
Typo police.
|
26962 |
26-Jun-1997 |
alex |
Style fix my previous commit.
|
26769 |
21-Jun-1997 |
alex |
Block all write operations to /proc/1/* when securelevel > 0. The additional check in procfs_ctl.c could be backed out, but I'm leaving it in for good measure.
Reviewed by: Theo de Raadt <deraadt@OpenBSD.org>
|
26271 |
29-May-1997 |
tegge |
Don't remove the controlling tty from the session if the vnode is being cleaned. This should help for PR kern/3581.
|
26111 |
25-May-1997 |
peter |
Fix some warnings (missing prototypes, wrong "generic" args etc) umapfs uses one of nullfs's functions...
|
25877 |
17-May-1997 |
phk |
Remove redundant check for vp == dvp (done in VFS before calling).
|
25535 |
07-May-1997 |
kato |
1. Added cast and parentheses in block size calculation in union_statfs().
2. staticized union vops.
Submitted by: Doug Rabson <dfr@nlsystems.com>
|
25531 |
07-May-1997 |
joerg |
Hide the kernel-only stuff inside #ifdef KERNEL. XXX should be #ifdef _KERNEL XXX^2 the !KERNEL part should probably be moved out into a publicly visible header file anyway.
|
25461 |
04-May-1997 |
joerg |
Oops. The function cd9660_mountroot() is gone, but i've committed an even more bogus prototype for it in my previous commit.
|
25460 |
04-May-1997 |
joerg |
This mega-commit brings the following:
. It makes cd9660 root f/s working again.
. It makes CD9660 a new-style option.
. It adds support to mount an ISO9660 multi-session CD-ROM as the root filesystem (the last session actually, but that's what is expected behaviour).
Sigh. The CDIOREADTOCENTRYS did a copyout() of its own, and thus has been unusable for me for this work. Too bad it didn't simply stuff the max 100 entries into the struct ioc_read_toc_entry, but relied on a user supplied data buffer instead. :-( I now had to reinvent the wheel, and created a CDIOREADTOCENTRY ioctl command that can be used in a kernel context.
While doing this, i noticed the following bogosities in existing CD-ROM drivers:
wcd: This driver is likely to be totally bogus when someone tries two succeeding CDIOREADTOCENTRYS (or now CDIOREADTOCENTRY) commands with requesting MSF format, since it apparently operates on an internal table.
scd: This driver apparently returns just a single TOC entry only for the CDIOREADTOCENTRYS command.
I have only been able to test the CDIOREADTOCENTRY command with the cd(4) driver. I hereby request the respective maintainers of the other CD-ROM drivers to verify my code for their driver. When it comes to merging this CD-ROM multisession stuff into RELENG_2_2 i will only consider drivers where i've got a confirmation that it actually works.
|
25397 |
03-May-1997 |
kato |
Fixed panic message in union_lock(): union_link --> union_lock.
|
25379 |
02-May-1997 |
kato |
Access correct union mount point in union_access. Old vnode is saved in savedvp variable and it is used for the argument of MOUNTTOUNIONMOUNT(). I didn't realize ap->a_vp is modified before MOUNTTOUNIONMOUNT(), so the change by revision 1.22 is incorrect.
|
25358 |
01-May-1997 |
sos |
Remove the dependency on DEV_BSIZE, now specfs works on != 512byte sector devices given that the fs uses a blocksize of at least a physical sector size.
|
25287 |
29-Apr-1997 |
joerg |
For multi-session CD-ROMs, we have to account for previous sessions as well in volume_space_size. Otherwise, NFS exports won't work.
|
25285 |
29-Apr-1997 |
joerg |
Add support for ISO9660 multi-session CD-ROMs. This is just nothing but searching the directory on something else than the default location.
NB: this comprises an interface change to the mount_cd9660(8) utility (commit will follow). You need to rebuild both.
I've got similar patches for RELENG_2_2, should i commit them too?
|
25261 |
29-Apr-1997 |
kato |
Revised fix for locking violation when unionfs calls vput with UN_KLOCK flag.
When UN_KLOCK is set, VOP_UNLOCK should keep uppervp locked and clear UN_ULOCK flag. To do this, when UN_KLOCK is set, (1) union_unlock clears UN_ULOCK and does not clear UN_KLOCK, (2) union_lock() does not access uppervp and does not clear UN_KLOCK, and (3) callers of vput/VOP_UNLOCK should clear UN_KLOCK. For example, vput becomes:
SETKLOCK(union_node);
vput(vnode);
CLEARKLOCK(union_node);
where SETKLOCK macro sets UN_KLOCK and CLEARKLOCK macro clears UN_KLOCK.
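The three rules can be traced with a small state model. This is a sketch under stated assumptions, not the unionfs code: the flag values are arbitrary, the struct is hypothetical, and a single integer stands in for the upper vnode's lock.

```c
/* Flag values are arbitrary here; the real ones live in union.h. */
#define UN_LOCKED 0x01
#define UN_ULOCK  0x02	/* upper vnode locked on behalf of the union node */
#define UN_KLOCK  0x04	/* keep the upper vnode locked across unlock */

struct toy_union_node {
	int un_flags;
	int upper_locked;	/* 1 while the upper vnode lock is held */
};

#define SETKLOCK(un)   ((un)->un_flags |= UN_KLOCK)
#define CLEARKLOCK(un) ((un)->un_flags &= ~UN_KLOCK)

void
toy_union_lock(struct toy_union_node *un)
{
	/* Rule (2): with UN_KLOCK set, leave the upper vnode alone. */
	if ((un->un_flags & (UN_KLOCK | UN_ULOCK)) == 0) {
		un->upper_locked = 1;
		un->un_flags |= UN_ULOCK;
	}
	un->un_flags |= UN_LOCKED;
}

void
toy_union_unlock(struct toy_union_node *un)
{
	if (un->un_flags & UN_KLOCK) {
		/* Rule (1): drop UN_ULOCK, keep the upper lock and UN_KLOCK. */
		un->un_flags &= ~UN_ULOCK;
	} else if (un->un_flags & UN_ULOCK) {
		un->upper_locked = 0;
		un->un_flags &= ~UN_ULOCK;
	}
	un->un_flags &= ~UN_LOCKED;
}
```

Running lock, SETKLOCK, unlock, CLEARKLOCK on this model leaves the upper vnode locked with no flags set, which is the state rule (3) expects the caller to end in.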
|
25207 |
27-Apr-1997 |
alex |
Removed bogon from previous commit: doubly included sys/systm.h.
|
25200 |
27-Apr-1997 |
alex |
Prevent debugger attachment to init when securelevel > 0.
Noticed by: Brian Buchanan <brian@wasteland.calbbs.com>
|
25192 |
27-Apr-1997 |
kato |
Undo 1.29.
|
25167 |
26-Apr-1997 |
kato |
Do nothing instead of adjusting un_flags when (uppervp is locked) && (UN_ULOCK is not set) in union_lock. This condition may indicate a race. DIAGNOSTIC kernel still panics here.
|
25160 |
26-Apr-1997 |
kato |
Do not clear UN_ULOCK in certain case.
Our vput calls vm_object_deallocate() --> vm_object_terminate(). The vm_object_terminate() calls vn_lock(), since UN_LOCKED has been already cleared in union_unlock(). Then, union_lock locks upper vnode when UN_ULOCK is not set. The upper vnode is not unlocked when UN_KLOCK is set in union_unlock(), thus, union_lock tries to lock locked vnode and we get panic.
|
25079 |
21-Apr-1997 |
kato |
Dirty change in union_lock(). Sometimes upper vnode is locked without UN_ULOCK flag. This shows a locking violation but I couldn't find the reason UN_ULOCK is not set or upper vnode is not unlocked. I added the code that detect this case and adjust un_flags. DIAGNOSTIC kernel doesn't adjust un_flags, but just panic here to help debug by kernel hackers.
|
25070 |
21-Apr-1997 |
kato |
Replace VOP_LOCK with vn_lock.
|
25055 |
20-Apr-1997 |
dyson |
Fix both a problem with accessing backing objects, and also release the process map on nonexistent pages. PR: kern/3327 Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>
|
25016 |
19-Apr-1997 |
kato |
Avoid `lock against myself' panic by following operation:
# mount -t union (or null) dir1 dir2
# mount -t union (or null) dir2 dir1
The function namei in union_mount calls union_root. The upper vnode has been already locked and vn_lock in union_root causes above panic.
Add printf's included in `#ifdef DIAGNOSTIC' for EDEADLK cases.
|
24988 |
17-Apr-1997 |
kato |
Fix `locking against myself' panic by multi nullfs mount of same directory pair.
|
24987 |
17-Apr-1997 |
kato |
Use NULLVP instead of NULL.
|
24985 |
16-Apr-1997 |
kato |
Do not set the uppervp to NULLVP in union_removed_upper. If lowervp is NULLVP, union node will have neither uppervp nor lowervp. This causes page fault trap.
The union_removed_upper just remove union node from cache and it doesn't set uppervp to NULLVP. Since union node is removed from cache, it will not be referenced.
The code that remove union node from cache was copied from union_inactive.
|
24974 |
16-Apr-1997 |
kato |
Undo previous commit to avoid panic, and fix order of argument of VOP_LINK(). The reason of strange behavior was wrong order of the argument, that is, the operation
# ln foo bar
in a union fs tried to do
# ln bar foo
in ufs layer.
Now we can make a link in a union fs.
|
24963 |
15-Apr-1997 |
kato |
Quick-hack to avoid `lock against myself' panic. It is not the real fix!
The ufs_link() assumes that vnode is not unlocked and tries to lock it in certain case. Because union_link calls VOP_LINK after locking vnode, vn_lock in ufs_link causes above panic.
Currently, I don't know the real fix for a locking violation in union_link, but I think it is important to avoid panic.
A vnode is unlocked before calling VOP_LINK and is locked after it if the vnode is not union fs. Even though panic went away, the process that access the union fs in which link was made will hang-up.
Hang-up can be easily reproduced by following operation:
mount -t union a b
cd b
ln foo bar
ls
|
24948 |
15-Apr-1997 |
bde |
Removed more traces of ISODEVMAP.
|
24934 |
14-Apr-1997 |
phk |
Remove all traces of undocumented feature ISODEVMAP.
|
24921 |
14-Apr-1997 |
kato |
Fix `lockmgr: locking against myself' panic by multi union mount of same directory pair.
If we do:
mount -t union a b
mount -t union a b
then, (1) namei tries to lock fs which has been already locked by first union mount and (2) union_root() tries to lock locked fs. To avoid first deadlock condition, unlock vnode if lowerrootvp is union node, and to avoid second case, union_mount returns EDEADLK when multi union mount is detected.
|
24918 |
14-Apr-1997 |
kato |
Fix locking violation when accessing `..'. Obtained from: NetBSD
|
24875 |
13-Apr-1997 |
kato |
Access correct union mount point in union_access.
|
24858 |
13-Apr-1997 |
phk |
The function union_fsync tries to lock overlaying vnode object when dolock is not set (that is, targetvp == overlaying vnode object). Current code use FIXUP macro to do this, and never unlocks overlaying vnode object in union_fsync. So, the vnode object will be locked twice and never unlocked.
PR: 3271 Submitted by: kato
|
24857 |
13-Apr-1997 |
phk |
The path name buffer, cn->cn_pnbuf, is FREEed by VOP_MKDIR when relookup() in union_relookup() succeeds. However, if relookup() returns non-zero value, that is relookup fails, VOP_MKDIR is never called (c.f. union_mkshadow). Thus, pathname buffer is never FREEed.
Reviewed by: phk Submitted by: kato PR: 3262
|
24856 |
13-Apr-1997 |
phk |
Though malloc allocates only cn.cn_namelen bytes for cn.cn_pnbuf in union_vn_create(), following bcopy copies cn.cn_namelen + 1 bytes to cn.cn_pnbuf.
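The general shape of this kind of fix is to size the allocation to match the copy. The log does not show the actual change to union_vn_create(), so the following is only a sketch of the pattern with a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy a name of namelen bytes plus a terminating
 * NUL.  The buffer is namelen + 1 bytes, so the write of the terminator
 * stays inside the allocation.
 */
char *
copy_pathname(const char *name, size_t namelen)
{
	char *buf = malloc(namelen + 1);

	if (buf == NULL)
		return (NULL);
	memcpy(buf, name, namelen);
	buf[namelen] = '\0';
	return (buf);
}
```

Allocating namelen bytes and then copying namelen + 1 would write one byte past the end of the buffer, which is the overrun the message describes.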
PR: 3255 Reviewed by: phk Submitted by: kato
|
24788 |
10-Apr-1997 |
bde |
Get the declaration of `struct dirent' from <sys/dirent.h>, not from <sys/dir.h>, and use the new macro GENERIC_DIRSIZ() instead of DIRSIZ().
Removed unused #includes.
|
24787 |
10-Apr-1997 |
bde |
Get the declaration of `struct dirent' from <sys/dirent.h>, not from <sys/dir.h>.
Removed unused #include.
Fixed type and order of struct members in pseudo-declaration of `struct vop_readdir_args'.
|
24785 |
10-Apr-1997 |
bde |
Removed unused or apparently-unused #includes, especially of the deprecated header <sys/dir.h>.
|
24666 |
06-Apr-1997 |
dyson |
Fix the gdb executable modify problem. Thanks to the detective work by Alan Cox <alc@cs.rice.edu>, and his description of the problem.
The bug was primarily in procfs_mem, but the mistake likely happened due to the lack of vm system support for the operation. I added better support for selective marking of page dirty flags so that vm_map_pageable(wiring) will not cause this problem again.
The code in procfs_mem is now less bogus (but maybe still a little so.)
|
24205 |
24-Mar-1997 |
bde |
Don't include <sys/ioctl.h> in the kernel. Stage 3: include <sys/filio.h> instead of <sys/ioctl.h> in non-network non-tty files.
|
24203 |
24-Mar-1997 |
bde |
Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include it when it is not used. In most cases, the reasons for including it went away when the special ioctl headers became self-sufficient.
|
24131 |
23-Mar-1997 |
bde |
Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined. Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.
|
23997 |
18-Mar-1997 |
peter |
Restore the lost MNT_LOCAL flag twiddle. Lite2 has a different mechanism of setting it (compiled into vfs_conf.c), but we have a dynamic system in place. This could probably be better done via a runtime configure flag in the VFS_SET() VFS declaration, perhaps VFCF_LOCAL, and have the VFS code propagate this down into MNT_LOCAL at mount time. The other FS's would need to be updated; having UFS and MSDOSFS filesystems without MNT_LOCAL breaks a few things. The man page rebuild scans for local filesystems and currently fails, and I suspect that other tools like find and tar with their "local filesystem only" modes might be affected.
|
23527 |
08-Mar-1997 |
bde |
Use the common nchstats struct instead of a private one for ncs_2passes and ncs_pass2. The public one is already used for other cd9660 statistics and the private one was effectively invisible.
|
23526 |
08-Mar-1997 |
bde |
Fixed missing initialisation of vp->v_type for types Pfile and Pmem in procfs_allocvp(). This fixes at least stat() of /proc/*/mem.
stat() of /proc/*/file already worked. I think procfs_allocvp() isn't actually called for type Pfile.
|
23351 |
03-Mar-1997 |
bde |
Don't export kernel interfaces to applications. msdosfs_mount probably didn't compile before this change.
Added idempotency ifdef.
|
23134 |
26-Feb-1997 |
bde |
Updated msdosfs to use Lite2 vfs configuration and Lite2 locking. It should now work as (un)well as before the Lite2 merge.
|
23077 |
24-Feb-1997 |
bde |
Fixed procfs's locking vops. They were missed in the Lite2 merge, partly because the #define's for them were moved to a different file. At least the null VOP_LOCK() no longer works, since vclean() expects VOP_LOCK( ..., LK_DRAIN | LK_INTERLOCK, ...) to clear the interlock. This probably only matters when simple_lock() is not null, i.e., when there are multiple CPUs or SIMPLELOCK_DEBUG is defined.
|
22975 |
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
22620 |
13-Feb-1997 |
bde |
Killed more FIFO ifdefs. All gone now.
|
22618 |
13-Feb-1997 |
bde |
Removed bogus B_AGE policy again (see rev 1.4).
Removed FIFO ifdef again (see rev.1.8). This also fixes vfs initialization since the VNODEOP_SET() was inside the ifdef.
|
22607 |
12-Feb-1997 |
mpp |
Eliminate the last of the compile warnings in this module by correctly casting the arguments to all of the null_bypass() calls.
|
22605 |
12-Feb-1997 |
mpp |
Restored #include of <sys/kernel.h> so that this compiles without warnings again.
|
22601 |
12-Feb-1997 |
mpp |
Make this compile without warnings after the Lite2 merge:
- *fs_init routines now take a "struct vfsconf * vfsp" pointer as an argument.
- Use the correct type for cookies.
- Update function prototypes.
Submitted by: bde
|
22600 |
12-Feb-1997 |
mpp |
Restored #include of <sys/kernel.h> so that this compiles without warnings again.
Submitted by: bde
|
22597 |
12-Feb-1997 |
mpp |
Make this compile again after the Lite2 merge. Also add missing function prototypes.
|
22596 |
12-Feb-1997 |
mpp |
Add missing function prototypes.
|
22595 |
12-Feb-1997 |
bde |
Added parameter names to prototypes that were added in the last commit to match nearby style.
|
22594 |
12-Feb-1997 |
bde |
Restored #include of <sys/kernel.h> so that this compiles again.
|
22593 |
12-Feb-1997 |
bde |
Declare function args in order in recently K&Rised function headers.
|
22582 |
12-Feb-1997 |
mpp |
Add function prototypes for the new Lite2 unionfs functions.
|
22579 |
12-Feb-1997 |
mpp |
Add function prototypes for most of the new Lite2 functions. Also made a few of the miscfs routines static to be consistent. Some modules simply required some additional #includes to remove -Wall warnings.
|
22567 |
11-Feb-1997 |
bde |
Restored one line of "High Sierra" changes from rev.1.8.
The Lite2 changes in cd9660 are scary. I probably missed some other lossage in this file.
|
22566 |
11-Feb-1997 |
bde |
Restored one line of "High Sierra" changes from rev.1.6 which was blown away by the previous commit.
Not restored: trailing whitespace changes from rev.1.7.
Not restored: -Wall cleanup from rev.1.5.
|
22565 |
11-Feb-1997 |
bde |
Removed High Sierra task from TODO list. Joerg did it years ago and other items were removed from the list when they were done in the Lite2 merge. The Lite2 merge just broke the High Sierra changes.
|
22521 |
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
21754 |
16-Jan-1997 |
dyson |
Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.
|
21673 |
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
21002 |
29-Dec-1996 |
dyson |
This commit is the embodiment of some VFS read clustering improvements. Firstly, now our read-ahead clustering is on a file descriptor basis and not on a per-vnode basis. This will allow multiple processes reading the same file to take advantage of read-ahead clustering. Secondly, there previously was a problem with large reads still using the ramp-up algorithm. Of course, that was bogus, and now we read the entire "chunk" off of the disk in one operation. The read-ahead clustering algorithm should use less CPU than the previous also (I hope :-)).
NOTE: LKMS MUST BE REBUILT!!!
|
20910 |
25-Dec-1996 |
bde |
Don't synchronously update the directory entry at the end of every successful write. Only do it for the IO_SYNC case (like ufs). On one of my systems, this speeds up `iozone 24 512' from 32K/sec (1/128 as fast as ufs) to 2.8MB/sec (7/10 as fast as ufs).
Obtained from: partly from NetBSD
|
20691 |
19-Dec-1996 |
bde |
Fixed lseek() on named pipes. It always succeeded but should always fail. Broke locking on named pipes in the same way as locking on non-vnodes (wrong errno). This will be fixed later.
The fix involves negative logic. Named pipes are now distinguished from other types of files with vnodes, and there is additional code to handle vnodes and named pipes in the same way only where that makes sense (not for lseek, locking or TIOCSCTTY).
|
20687 |
19-Dec-1996 |
bde |
Fixed errno for unsupported advisory locks. The errno is now EINVAL for fcntl() and EOPNOTSUPP for flock(). POSIX specifies the weaker EINVAL errno and the man page agrees.
Not fixed:
deadfs: always returns wrong EBADF
devfs, msdosfs: always return sometimes-wrong EINVAL
cd9660, fdesc, kernfs, portal: always return sometimes-wrong EOPNOTSUPP
procfs: always returns wrong EIO
mfs: panic?!
nfs: fudged
NetBSD uses a generic file system genfs to return the sometimes-wrong EOPNOTSUPP more consistently :-)(.
Found by: NIST-PCTS
|
20138 |
04-Dec-1996 |
bde |
Fixed an off by 1 error in unix2dostime(). The first day of each month was converted to the last day of the previous month. This bug was introduced in the optimizations in rev.1.4.
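The kind of month-boundary case involved can be illustrated with a small sketch. This is not the unix2dostime() code: yday_to_md() and month_days are hypothetical names, and leap years are ignored to keep the example short.

```c
#include <assert.h>

/* Days per month in a non-leap year (assumption: leap years ignored). */
static const int month_days[12] = {
	31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

/*
 * Convert a zero-based day of the year into a 1-based month and day.
 * The comparison must be ">=": with ">", yday == 31 (1 February) would
 * stay in January as "day 32", a typical off-by-one at a month boundary.
 */
void
yday_to_md(int yday, int *month, int *day)
{
	int m;

	assert(yday >= 0 && yday < 365);
	for (m = 0; yday >= month_days[m]; m++)
		yday -= month_days[m];
	*month = m + 1;
	*day = yday + 1;
}
```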
|
19261 |
30-Oct-1996 |
dyson |
Fix a potential deadlock from the previous commit.
|
19260 |
30-Oct-1996 |
dyson |
Fix the /proc/???/map file so that it is possible to read an arbitrarily large process map. Another commit will follow to fix a problem just found during this one... Sorry!!! :-(.
|
19141 |
24-Oct-1996 |
dyson |
Fix setting breakpoints in shared regions.
|
19067 |
20-Oct-1996 |
alex |
Fix signed/unsigned comparison warnings.
Reviewed by: bde
|
18775 |
06-Oct-1996 |
dyson |
Substitution of a long divide by a shift. Other cosmetic improvements. Submitted by: bde
|
18640 |
02-Oct-1996 |
dyson |
MSDOS FS used to allocate a buffer before extending the VM object. In certain error conditions, it is possible for pages to be left allocated in the object beyond its end. It is generally bad practice to allocate pages beyond the end of an object.
|
18413 |
20-Sep-1996 |
nate |
Whoops, I should've used the LINT config file. More ts -> tv changes for timespec structure.
|
18412 |
20-Sep-1996 |
nate |
Whoops, I should've used the LINT config file. More ts -> tv changes for timespec structure.
|
18397 |
19-Sep-1996 |
nate |
In sys/time.h, struct timespec is defined as:
/*
 * Structure defined by POSIX.4 to be like a timeval.
 */
struct timespec {
	time_t	ts_sec;		/* seconds */
	long	ts_nsec;	/* and nanoseconds */
};
The correct names of the fields are tv_sec and tv_nsec.
Reminded by: James Drobina <jdrobina@infinet.com>
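For reference, a short sketch that uses the corrected field names, tv_sec and tv_nsec, as they appear in the standard <time.h> declaration. timespec_add_nsec() is a hypothetical helper, not part of this change.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: add a non-negative number of nanoseconds to a
 * timespec, keeping tv_nsec in the range [0, 1000000000).
 */
struct timespec
timespec_add_nsec(struct timespec ts, long nsec)
{
	long long total;

	assert(nsec >= 0);
	total = (long long)ts.tv_nsec + nsec;
	ts.tv_sec += (time_t)(total / 1000000000LL);
	ts.tv_nsec = (long)(total % 1000000000LL);
	return (ts);
}
```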
|
18020 |
03-Sep-1996 |
bde |
Eliminated nested include of <sys/unistd.h> in <sys/file.h> in the kernel. Include it directly in the few places where it is used.
Reduced some #includes of <sys/file.h> to #includes of <sys/fcntl.h> or nothing.
|
17974 |
31-Aug-1996 |
bde |
Fixed the easy cases of const poisoning in the kernel. Cosmetic.
|
17761 |
21-Aug-1996 |
dyson |
Even though this looks like it, this is not a complex code change. The interface into the "VMIO" system has changed to be more consistent and robust. Essentially, it is now no longer necessary to call vn_open to get merged VM/Buffer cache operation, and exceptional conditions such as merged operation of VBLK devices are simpler and more correct.
This code corrects a potentially large set of problems including the problems with ktrace output and loaded systems, file create/deletes, etc.
Most of the changes to NFS are cosmetic and name changes, eliminating a layer of subroutine calls. The direct calls to vput/vrele have been re-instituted for better cross platform compatibility.
Reviewed by: davidg
|
17314 |
28-Jul-1996 |
ache |
bzero the reserved field in the directory entry; junk here causes a scandisk error under Win95
|
17306 |
27-Jul-1996 |
dyson |
Modify slightly the output from the map file in /proc. Now the executable bit is shown.
|
17303 |
27-Jul-1996 |
dyson |
Under certain circumstances, reading the /proc/*/map file can crash the system. Nonexistent objects were not handled correctly.
|
17296 |
27-Jul-1996 |
dyson |
Remove a totally unneeded (and as of the last VM commit, incorrect) call to pmap_clear_modify.
|
16901 |
02-Jul-1996 |
dyson |
Implement locking for pfs nodes, when at the leaf. Concurrent access to information from a single process causes hangs. Specifically, this fixes problems (hangs) with concurrent ps commands, when the system is under heavy memory load. Reviewed by: davidg
|
16889 |
02-Jul-1996 |
dyson |
Fix a serious problem, with a window where an object lock is needed, but not there. The extent of the object lock is expanded to be over the range that it is needed. Additionally, clean up the code so that it conforms to better coding style.
|
16476 |
18-Jun-1996 |
dyson |
Add procfs_type.c to the repository.
|
16474 |
18-Jun-1996 |
dyson |
Clean-up the new VM map procfs code, and also add support for executable format file "etype". It contains a description of the binary type for a process.
|
16468 |
17-Jun-1996 |
dyson |
This file is the "meat" of the process address space capability. If you would like other things added, just ask!!! It might be pretty easy to add.
|
16467 |
17-Jun-1996 |
dyson |
Add a feature to procfs to allow display of the process address map with multiple entries as follows:
start address, end address, resident pages in range, private pages in range, RW/RO, COW or not, (vnode/device/swap/default).
|
16363 |
14-Jun-1996 |
asami |
The Great PC98 Merge.
All new code is "#ifdef PC98"ed so this should make no difference to PC/AT (and its clones) users.
Ok'd by: core Submitted by: FreeBSD(98) development team
|
16322 |
12-Jun-1996 |
gpalmer |
Clean up -Wunused warnings.
Reviewed by: bde
|
16312 |
12-Jun-1996 |
dg |
Moved the fsnode MALLOC to before the call to getnewvnode() so that the process won't possibly block before filling in the fsnode pointer (v_data) which might be dereferenced during a sync since the vnode is put on the mnt_vnodelist by getnewvnode.
Pointed out by Matt Day <mday@artisoft.com>
|
16311 |
12-Jun-1996 |
dg |
Moved the fsnode MALLOC to before the call to getnewvnode() so that the process won't possibly block before filling in the fsnode pointer (v_data) which might be dereferenced during a sync since the vnode is put on the mnt_vnodelist by getnewvnode.
|
16308 |
11-Jun-1996 |
dyson |
Properly lock the vm space when accessing the memory in a process. This fix could solve some "interesting" problems that could happen during process rundown.
|
15538 |
02-May-1996 |
phk |
First pass at cleaning up macros relating to pages, clusters and all that.
|
15055 |
05-Apr-1996 |
ache |
Fix adjkerntz expression priority. Make filetimes the same as DOS times for UTC cmos clock.
|
15053 |
05-Apr-1996 |
ache |
Don't adjust file times for UTC clock to have the same timestamps for DOS/FreeBSD.
|
15033 |
03-Apr-1996 |
gpalmer |
add a `Warning:' to the message saying that the root directory is not a multiple of the clustersize in length to try and reduce the number of questions we get on the subject.
|
14693 |
19-Mar-1996 |
dyson |
Fix the problem that unmounting filesystems that are backed by a VMIO device have reference count problems. We mark the underlying object non-persistent, and account for the reference count that the VM system maintains for the special device close. This should fix the removable device problem.
|
14625 |
14-Mar-1996 |
joerg |
Provide a better handling of partially corrupted directory entries.
Submitted by: bde
|
14553 |
11-Mar-1996 |
peter |
Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all files are off the vendor branch, so this should not change anything.
A "U" marker generally means that the file was not changed in between the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally means that there was a change. [note, new file: cd9660_mount.h]
|
14532 |
11-Mar-1996 |
hsu |
For Lite2: proc LIST changes. Reviewed by: davidg & bde
|
14434 |
09-Mar-1996 |
dyson |
Make sure that the zero flag is cleared upon completion of paging I/O.
|
14093 |
13-Feb-1996 |
wollman |
Kill XNS. While we're at it, fix socreate() to take a process argument. (This was supposed to get committed days ago...)
|
13838 |
02-Feb-1996 |
wosch |
add ruid and rgid to file 'status'
|
13765 |
30-Jan-1996 |
mpp |
Fix a bunch of spelling errors in the comment fields of a bunch of system include files.
|
13627 |
25-Jan-1996 |
peter |
This time, really make the procfs work when reading stuff from the UPAGES.
This is a really ugly bandaid on the problem, but it works well enough for 'ps -u' to start working again. The problem was caused by the user address space shrinking by a little bit and the UPAGES being "cast off" to become a separate entity rather than being at the top of the process's vmspace. That optimization was part of John's most recent VM speedups.
Now, rather than decoding the VM space, it merely ensures the pages are in core and accesses them the same way the ptrace(PT_READ_U..) code does, ie: off the p->p_addr pointer.
|
13608 |
24-Jan-1996 |
peter |
Major fixes for procfs..
Implement a "variable" directory structure. Files that do not make sense for the given process do not "appear" and cannot be opened. For example, "system" processes do not have "file", "regs" or "fpregs", because they do not have a user area.
"attempt" to fill in the user area of a given process when it is being accessed via /proc/pid/mem (the user struct is just after VM_MAXUSER_ADDRESS in the process address space.)
Don't do IO to the U area while it's swapped, hold it in place if possible.
Lock off access to the "ctl" file if it's done a setuid like the other pseudo-files in there.
|
13490 |
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
|
13260 |
05-Jan-1996 |
wollman |
Convert QUOTA to new-style option.
|
13160 |
01-Jan-1996 |
phk |
I have some problem here, which shows up in the ahc0 driver. It isn't where it originates, so I catch it here and fail. This may expose the same bug on other disk controllers (both scsi & ide).
|
12904 |
17-Dec-1995 |
bde |
Fixed 1TB filesize changes. Some pindexes had bogus names and types but worked because vm_pindex_t is indistinguishable from vm_offset_t.
|
12820 |
14-Dec-1995 |
phk |
Another mega commit to staticize things.
|
12813 |
13-Dec-1995 |
julian |
devsw tables are now arrays of POINTERS to struct [cb]devsw. Seems to work here just fine, though I can't check every file that changed due to limited h/w; however, I've checked enough to be pretty happy with the code.
WARNING... struct lkm[mumble] has changed so it might be an idea to recompile any lkm related programs
|
12771 |
11-Dec-1995 |
phk |
Back out this one, must have screwed up somewhere :-(
|
12769 |
11-Dec-1995 |
phk |
Staticize.
|
12767 |
11-Dec-1995 |
dyson |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
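The idea of naming pages by index rather than by byte offset can be sketched as follows. The 4 KB page size, the 32-bit index type, and the EX_ names are assumptions for illustration; the kernel's own type is vm_pindex_t, whose definition is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

#define	EX_PAGE_SHIFT	12				/* assumes 4 KB pages */
#define	EX_PAGE_SIZE	(1u << EX_PAGE_SHIFT)

typedef uint32_t ex_pindex_t;	/* hypothetical stand-in for vm_pindex_t */

/*
 * A page index counts whole pages, so an index of a given width can
 * name pages far beyond what a byte offset of the same width can reach.
 */
ex_pindex_t
offset_to_pindex(uint64_t offset)
{
	return ((ex_pindex_t)(offset >> EX_PAGE_SHIFT));
}

uint64_t
pindex_to_offset(ex_pindex_t pindex)
{
	return ((uint64_t)pindex << EX_PAGE_SHIFT);
}
```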
|
12675 |
08-Dec-1995 |
julian |
Pass 3 of the great devsw changes most devsw referenced functions are now static, as they are in the same file as their devsw structure. I've also added DEVFS support for nearly every device in the system, however many of the devices have 'incorrect' names under DEVFS because I couldn't quickly work out the correct naming conventions. (but devfs won't be coming on line for a month or so anyhow so that doesn't matter)
If you "OWN" a device which would normally have an entry in /dev then search for the devfs_add_devsw() entries and munge to make them right.. check out similar devices to see what I might have done in them if you can't see what's going on.. for a laugh compare conf.c conf.h before and after... :) I have not done DEVFS entries for any DISKSLICE devices yet as that will be a much more complicated job.. (pass 5 :)
pass 4 will be to make the devsw tables of type (cdevsw * ) rather than (cdevsw) seems to work here.. complaints to the usual places.. :)
|
12662 |
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
12645 |
05-Dec-1995 |
bde |
Include <vm/vm.h> or <vm/vm_page.h> explicitly to avoid breaking when vnode_if.h doesn't include vm stuff.
|
12636 |
05-Dec-1995 |
bde |
Restored #include of <sys/tty.h>. fdesc_vnops.c needs to know too much about tty_tty.c's cdevswitch functions.
|
12597 |
03-Dec-1995 |
bde |
Added prototypes.
cd9660_rrip.c: Added lots of bogus casts to hide type errors exposed by the prototypes. (Different structs are assumed to have a common prefix.)
cd9660_vnops.c: Finished staticizing.
|
12596 |
03-Dec-1995 |
bde |
Added prototypes.
|
12595 |
03-Dec-1995 |
bde |
Added prototypes.
Removed some unnecessary #includes.
|
12594 |
03-Dec-1995 |
bde |
null_node_find() and umap_node_find() were sometimes called without a `struct mount *' arg. I don't know what the effects of this were.
|
12570 |
02-Dec-1995 |
phk |
staticize.
|
12520 |
29-Nov-1995 |
julian |
#ifdef out nearly the entire file of conf.c when JREMOD is defined. Add a few safety checks in specfs because now it's possible to get entries in [cd]devsw[] which are ALL NULL, so it's better to discover this BEFORE jumping into the d_open() entry..
more checks to come later.. this gets the code to the stage where I can start testing it, even if I haven't caught every little error case... I guess I'll find them quick enough..
|
12453 |
21-Nov-1995 |
bde |
Completed function declarations and/or added prototypes.
|
12412 |
20-Nov-1995 |
dyson |
Since FreeBSD clustering code now supports filesystems < PAGE_SIZE, enable clustering for cd9660, thereby giving a BIG performance boost.
|
12373 |
18-Nov-1995 |
bde |
KNFized spec_getpages_idone() and spec_getpages().
Moved misplaced #includes.
Completed function pointer declarations.
|
12338 |
16-Nov-1995 |
bde |
Moved declarations for static functions to the correct place (not in a header) and cleaned them up.
|
12337 |
16-Nov-1995 |
bde |
Moved declarations for static functions to the correct place (not in a header).
Removed stupid comments.
|
12336 |
16-Nov-1995 |
bde |
Fixed the type of procfs_sync(). Trailing args were missing.
Fixed the type of procfs_fhtovp(). The args had little resemblance to the correct ones.
Added prototypes.
|
12335 |
16-Nov-1995 |
bde |
Fixed the type of portal_sync(). Trailing args were missing.
Fixed the type of portal_fhtovp(). The args had little resemblance to the correct ones.
Added prototypes.
|
12333 |
16-Nov-1995 |
bde |
Fixed the type of fdesc_sync(). Trailing args were missing.
Fixed the type of fdesc_fhtovp(). The args had little resemblance to the correct ones.
Added prototypes.
|
12287 |
14-Nov-1995 |
phk |
Get rid of hostnamelen variable.
|
12265 |
13-Nov-1995 |
bde |
Fixed getdirentries() on nfs mounted msdosfs's. No cookies were returned for certain common combinations of directory sizes, cluster sizes, and i/o sizes (e.g., 4K, 4K, and 4K). The fix in rev. 1.21 was incomplete.
Reviewed by: dfr Obtained from: partly from NetBSD
|
12230 |
12-Nov-1995 |
dg |
Brought in the setattr call support from Lite-2 so that more correct error returns are provided.
Obtained from: 4.4BSD-Lite2
|
12228 |
12-Nov-1995 |
dg |
Fix isoilk hang caused by not checking for read-onlyness in several places. The fix for this in Lite-2 is more complete, but these quick hacks of mine are safer for now. I plan to integrate the additional Lite-2 stuff at some later time. Should completely fix PR810.
|
12203 |
11-Nov-1995 |
bde |
Removed unused function dead_nullop().
Converted incomplete function declarations to prototypes.
|
12158 |
09-Nov-1995 |
bde |
Introduced a type `vop_t' for vnode operation functions and used it 1138 times (:-() in casts and a few more times in declarations. This change is null for the i386.
The type has to be `typedef int vop_t(void *)' and not `typedef int vop_t()' because `gcc -Wstrict-prototypes' warns about the latter. Since vnode op functions are called with args of different (struct pointer) types, neither of these function types is any use for type checking of the arg, so it would be preferable not to use the complete function type, especially since using the complete type requires adding 1138 casts to avoid compiler warnings and another 40+ casts to reverse the function pointer conversions before calling the functions.
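A minimal sketch of the casting pattern the message describes. Only the typedef is taken from the commit text; demo_open(), struct demo_open_args, and demo_vnodeops are hypothetical names, and the real operation vectors are laid out differently.

```c
#include <assert.h>
#include <stddef.h>

typedef int vop_t(void *);	/* the type quoted in the commit message */

/* Hypothetical argument struct and operation, for illustration only. */
struct demo_open_args {
	int mode;
};

int
demo_open(struct demo_open_args *ap)
{
	return (ap->mode);
}

/* Storing the function in a generic table needs a cast to vop_t *. */
vop_t *demo_vnodeops[] = {
	(vop_t *)demo_open,
};

/* Calling it needs the reverse cast back to the function's real type. */
int
demo_call_open(struct demo_open_args *ap)
{
	int (*fn)(struct demo_open_args *);

	assert(ap != NULL);
	fn = (int (*)(struct demo_open_args *))demo_vnodeops[0];
	return (fn(ap));
}
```

Because every entry is cast to the same type, the compiler cannot check the argument struct at the call site, which is the limitation the message points out.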
|
12145 |
07-Nov-1995 |
phk |
missed one static thingie.
|
12144 |
07-Nov-1995 |
phk |
staticize private parts.
|
12143 |
07-Nov-1995 |
phk |
Make a lot of private stuff static. Should anybody out there wonder about this vendetta against global variables, it is basically to make it more visible what our interfaces in the kernel really are. I'm almost convinced we should have a #define PUBLIC /* public interface */ and use it in the #includes...
|
11977 |
31-Oct-1995 |
pst |
Pad out MSDOS boot block to 512 bytes (bugfix only) Submitted by: Andreas Haakh, ah@alman.RoBIN.de
|
11954 |
31-Oct-1995 |
phk |
Make a lot of stuff static.
|
11921 |
29-Oct-1995 |
phk |
Second batch of cleanup changes. This time mostly making a lot of things static and some unused variables here and there.
|
11707 |
23-Oct-1995 |
dyson |
Removal of unnecessary usage of PG_COPYONWRITE.
|
11701 |
23-Oct-1995 |
dyson |
Finalize GETPAGES layering scheme. Move the device GETPAGES interface into specfs code. No need at this point to modify the PUTPAGES stuff except in the layered-type (NULL/UNION) filesystems.
|
11644 |
22-Oct-1995 |
dg |
Moved the filesystem read-only check out of the syscalls and into the filesystem layer, as was done in lite-2. Merged in some other cosmetic changes while I was at it. Rewrote most of msdosfs_access() to be more like ufs_access() and to include the FS read-only check.
Obtained from: partially from 4.4BSD-lite2
|
11333 |
08-Oct-1995 |
swallace |
Add #include <sys/sysproto.h> to get struct close_args and close function prototype.
|
11297 |
07-Oct-1995 |
bde |
Return EINVAL instead of panicking for rename("dir1", "dir2/..").
Fixes part of PR 760.
This bug seems to be very old.
|
11262 |
06-Oct-1995 |
phk |
Avoid some 64bit divides.
|
10551 |
04-Sep-1995 |
dyson |
Added VOP_GETPAGES/VOP_PUTPAGES and also the "backwards" block count for VOP_BMAP. Updated affected filesystems...
|
10534 |
02-Sep-1995 |
mpp |
Do not allow delete/rename lookup request to prevent panics if a user attempts to remove/rename files in a fdesc file system.
|
10533 |
02-Sep-1995 |
mpp |
Correctly initialize the mount stat structure so that fdesc file systems show up in "mount" correctly and so that they can then be unmounted.
|
10531 |
02-Sep-1995 |
mpp |
Change procfs_lookup to not allow delete/rename operations to prevent panics when a user tries to remove/rename the contents of /proc/###/*.
Obtained from: 4.4BSD-lite2
|
10272 |
25-Aug-1995 |
bde |
Fix bogus arg (&p instead of p) in the call to VOP_ACCESS() from msdosfs_setattr(). The bug was benign because the arg isn't used.
|
10093 |
17-Aug-1995 |
bde |
The `cred' and `proc' args were missing for some VOP_OPEN() and VOP_CLOSE() calls.
Found by: gcc -Wstrict-prototypes after I supplied some of the 5000+ missing prototypes. Now I have 9000+ lines of warnings and errors about bogus conversions of function pointers.
|
10027 |
11-Aug-1995 |
dg |
Converted mountlist to a CIRCLEQ.
Partially obtained from: 4.4BSD-Lite2
|
10024 |
11-Aug-1995 |
dg |
Be careful not to dereference NULL credentials pointers when doing the getattr function.
|
9973 |
06-Aug-1995 |
jkh |
Allow a pipe to be opened read/write at one end, as is allowed in SunOS and SCO. You can then even use the pipe as a cheap fifo stack (yuck!). A semantic change also important (but not limited) to iBCS2 compatibility. Submitted by: swallace
|
9878 |
03-Aug-1995 |
dfr |
Make sure that a non-null cookie vector is returned even if there were no valid entries in the block. Doing otherwise confuses the nfs server.
|
9862 |
02-Aug-1995 |
dfr |
Add support for the va_filerev attribute required by NFSv3.
|
9842 |
01-Aug-1995 |
dg |
Removed my special-case hack for VOP_LINK and fixed the problem with the wrong vp's ops vector being used by changing the VOP_LINK's argument order. The special-case hack doesn't go far enough and breaks the generic bypass routine used in some non-leaf filesystems. Pointed out by Kirk McKusick.
|
9759 |
29-Jul-1995 |
bde |
Eliminate sloppy common-style declarations. There should be none left for the LINT configuration.
|
9715 |
25-Jul-1995 |
bde |
Change `extern inline' to `static inline' so that several functions don't go away when the kernel is compiled with -O.
The functions are backed up by extern versions in cd9660_util.c, but these versions are disabled by `#ifdef __notanymore__'. They could have been enabled by using `#if defined(__notanymore__) || !defined(__OPTIMIZE__)' but then I would have had to check that they still work. The correct way to handle all this is to replace `extern inline' by `EXTERN_INLINE' and define `EXTERN_INLINE' as `extern inline' in most modules and as empty in one module.
|
9542 |
16-Jul-1995 |
joerg |
There is a small bug in the cd9660 code that prevents stating of associated files.
Submitted by: leo@dachau.marco.de (Matthias Pfaller) Not-obtained from: NetBSD. Instead sent directly to me by Matthias. (Sorry, this is to prevent people from claiming I might have gotten this from NetBSD. :)
|
9540 |
16-Jul-1995 |
bde |
Don't include <sys/tty.h> in drivers that aren't tty drivers or in general files that don't depend on the internals of <sys/tty.h>
|
9507 |
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although it will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes)
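Item 7 above describes balancing the pageout cluster around the requested page. A minimal sketch of that balancing idea follows, assuming pages are addressed by index within an object. The names `struct cluster` and `center_cluster` and the clamping policy are hypothetical illustrations, not the vm_pageout.c code, which also has to examine the state of each candidate page. The sketch assumes 0 <= target < npages and max_cluster >= 1.

```c
/* Hypothetical sketch: a window of up to max_cluster pages around target. */
struct cluster {
	long start;	/* index of the first page in the window */
	long count;	/* number of pages in the window */
};

struct cluster
center_cluster(long target, long npages, long max_cluster)
{
	struct cluster c;
	long behind = (max_cluster - 1) / 2;	/* pages before the target */
	long ahead = max_cluster - 1 - behind;	/* pages after the target */
	long start = target - behind;
	long end = target + ahead + 1;		/* exclusive upper bound */

	/* Near the start of the object: slide the window forward. */
	if (start < 0) {
		end -= start;
		start = 0;
	}
	/* Near the end of the object: slide the window back, then clamp. */
	if (end > npages) {
		start -= end - npages;
		end = npages;
		if (start < 0)
			start = 0;
	}
	c.start = start;
	c.count = end - start;
	return (c);
}
```

With these assumptions, a request for page 10 of a 100-page object with an 8-page limit yields pages 7 through 14: three behind and four ahead of the target.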
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
9435 |
08-Jul-1995 |
dg |
Added missing splx() in DIAGNOSTIC code. Suggested by enami@sys.ptg.sony.co.jp.
|
9354 |
28-Jun-1995 |
dg |
Fixed VOP_LINK argument order botch.
|
9346 |
28-Jun-1995 |
dg |
Killed the "probably_never" ifdef'd code.
|
9202 |
11-Jun-1995 |
rgrimes |
Merge RELENG_2_0_5 into HEAD
|
8876 |
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
8740 |
25-May-1995 |
dg |
Fixed panic that resulted from mmapping files in kernfs and procfs. A regular user could panic the machine with a simple "tail /proc/curproc/mem" command. The problem was twofold: both kernfs and procfs didn't fill in the mnt_stat statfs struct (which would later lead to an integer divide fault in the vnode pager), and kernfs bogusly panicked if a bmap was attempted.
Reviewed by: John Dyson
|
8624 |
19-May-1995 |
dg |
NFS diskless operation was broken because swapdev_vp wasn't initialized. These changes solve the problem in a general way by moving the initialization out of the individual fs_mountroot's and into swaponvp().
Submitted by: Poul-Henning Kamp
|
8456 |
11-May-1995 |
rgrimes |
Fix -Wformat warnings from LINT kernel.
|
8386 |
09-May-1995 |
bde |
Submitted by: Mike Pritchard <pritc003@maroon.tc.umn.edu>
msdosfs_lookup() did no validation to see if the caller was allowed to delete/rename/create files. msdosfs_setattr() did no validation to see if the caller was allowed to change the file permissions (turn on/off the write bit) or update the file modification time (utimes).
The routines were fixed to validate the calls just like ufs does.
|
7835 |
15-Apr-1995 |
dg |
For P_SUGID processes, we must also change ownership of the mem file to root so that group kmem can still get to it. *SIGH*
|
7833 |
15-Apr-1995 |
dg |
Retain group kmem readability for P_SUGID processes.
|
7832 |
15-Apr-1995 |
dg |
Made /proc/n/mem file group kmem and group readable. Needed to fix ps so that it doesn't need to be setuid root.
|
7760 |
11-Apr-1995 |
ache |
Fix link sys call.
Submitted by: pritc003@maroon.tc.umn.edu
|
7755 |
11-Apr-1995 |
bde |
Submitted by: Mike Pritchard <pritc003@maroon.tc.umn.edu>
Fix PR 303: msdosfs: moving a file into another directory causes panic.
" ... the code that does the rename already has the denode locked when msdosfs_hashins() gets called, resulting in the panic when the routine attempts to lock the denode again. ... The attached patch changes the msdosfs_hashins() routine to not lock the denode. The caller is now responsible for obtaining the lock instead of having msdosfs_hashins() do it for them."
|
7754 |
11-Apr-1995 |
bde |
Submitted by: Wolfgang Solfrank <ws@tools.de>
Fix off-by-1-sector error in the range checking for the end of the root directory. It was possible for the root directory to overwrite the FAT.
|
7695 |
09-Apr-1995 |
dg |
Changes from John Dyson and myself:
Fixed remaining known bugs in the buffer IO and VM system.
vfs_bio.c: Fixed some race conditions and locking bugs. Improved performance by removing some (now) unnecessary code and fixing some broken logic. Fixed process accounting of # of FS outputs. Properly handle NFS interrupts (B_EINTR).
(various) Replaced calls to clrbuf() with calls to an optimized routine called vfs_bio_clrbuf().
(various FS sync) Sync out modified vnode_pager backed pages.
ffs_vnops.c: Do two passes: Sync out file data first, then indirect blocks.
vm_fault.c: Fixed deadly embrace caused by acquiring locks in the wrong order.
vnode_pager.c: Changed to use buffer I/O system for writing out modified pages. This should fix the problem with the modification date previously not getting updated. Also dramatically simplifies the code. Note that this is going to change in the future and be implemented via VOP_PUTPAGES().
vm_object.c: Fixed a pile of bugs related to cleaning (vnode) objects. The performance of vm_object_page_clean() is terrible when dealing with huge objects, but this will change when we implement a binary tree to keep the object pages sorted.
vm_pageout.c: Fixed broken clustering of pageouts. Fixed race conditions and other lockup style bugs in the scanning of pages. Improved performance.
|
7465 |
29-Mar-1995 |
ache |
Fix timestamps when using Wall CMOS clock, optimize dos2unixtime().
Submitted by: pritc003@maroon.tc.umn.edu
|
7430 |
28-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) that I didn't notice when I fixed "all" such warnings before.
|
7429 |
28-Mar-1995 |
phk |
Readdir on a CDrom would return bogus "d_type" values, potentially confusing everybody (incl find(1) ?). Initialize it to DT_UNKNOWN. Maybe we can do better, but I don't have the time.
|
7170 |
19-Mar-1995 |
dg |
Removed redundant newlines that were in some panic strings.
|
7161 |
19-Mar-1995 |
dg |
Removed bogus, commented out, call to vnode_pager_uncache().
|
7095 |
16-Mar-1995 |
wollman |
Add four more filesystem flags:
VFCF_NETWORK (this FS goes over the net)
VFCF_READONLY (read-write mounts do not make any sense)
VFCF_SYNTHETIC (data in this FS is not real)
VFCF_LOOPBACK (this FS aliases something else)
cd9660 is readonly; nullfs, umapfs, and union are loopback; NFS is network; procfs, kernfs, and fdesc are synthetic.
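The classification above can be read as a small lookup table. The sketch below is only an illustration: the bit values are placeholders rather than the ones in the kernel header, and `struct fs_class`, `fs_classes`, and `fs_class_flags` are hypothetical names introduced here.

```c
#include <string.h>

/* Placeholder bit values for illustration; the kernel header has the real ones. */
#define VFCF_NETWORK	0x1	/* this FS goes over the net */
#define VFCF_READONLY	0x2	/* read-write mounts do not make any sense */
#define VFCF_SYNTHETIC	0x4	/* data in this FS is not real */
#define VFCF_LOOPBACK	0x8	/* this FS aliases something else */

/* Hypothetical table pairing each filesystem named in the log with its flag. */
struct fs_class {
	const char *name;
	int flags;
};

static const struct fs_class fs_classes[] = {
	{ "cd9660", VFCF_READONLY },
	{ "nullfs", VFCF_LOOPBACK },
	{ "umapfs", VFCF_LOOPBACK },
	{ "union",  VFCF_LOOPBACK },
	{ "nfs",    VFCF_NETWORK },
	{ "procfs", VFCF_SYNTHETIC },
	{ "kernfs", VFCF_SYNTHETIC },
	{ "fdesc",  VFCF_SYNTHETIC },
};

/* Return the flags for a filesystem name, or 0 if it has none of them. */
int
fs_class_flags(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(fs_classes) / sizeof(fs_classes[0]); i++)
		if (strcmp(fs_classes[i].name, name) == 0)
			return (fs_classes[i].flags);
	return (0);
}
```

Keeping the flags as independent bits lets a caller test one property at a time, for example `fs_class_flags("cd9660") & VFCF_READONLY`.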
|
7090 |
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
6603 |
21-Feb-1995 |
bde |
Obtained from: memories of 1.1.5
Fix the sign of the timezone offset again.
|
6569 |
20-Feb-1995 |
dg |
Make sure process isn't swapped when messing with it. Added missing newline to log() call.
|
6364 |
14-Feb-1995 |
phk |
YFfix
|
6339 |
13-Feb-1995 |
phk |
strategy for block and char devices are rightfully spec_strategy. I feel like yanking all the "ISODEVMAP" stuff altogether, it looks like a bad kludge...
|
6303 |
10-Feb-1995 |
bde |
Use the correct block number for updating the backup copy of the FAT when deleting a file. Deleting a large file used to scramble the backup copy.
|
6151 |
03-Feb-1995 |
dg |
Fixed bmap run-length brokenness. Use bmap run-length extension when doing clustered paging.
Submitted by: John Dyson
|
6001 |
29-Jan-1995 |
ats |
Kill the comment in a comment to shut up the compiler.
|
5651 |
16-Jan-1995 |
joerg |
Roll in my changes to make the cd9660 code understand the older (original "High Sierra") CD format. I've already implemented this for 1.1.5.1 (and posted to -hackers), but didn't get any response to it. Perhaps i'm the only one who has such an old CD lying around...
Everything is done empirically, but i had three of them around (from different vendors), so there's a high probability that i've got it right. :)
|
5455 |
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c: Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c: Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c, vm_map.c: Dynamic kernel VM size; now we don't have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c: Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h: Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c: Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c: Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c: Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c: Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
5403 |
05-Jan-1995 |
dg |
Initialize map start hint to vm_map_find()...not doing so will cause it to fail if the random thing on the stack happens to be too large.
Submitted by: David Jones <dej@qpoint.torfree.net>
|
5312 |
31-Dec-1994 |
ache |
Fix problem when attached process detached.
Submitted by: Gary Jennejohn
|
5241 |
27-Dec-1994 |
bde |
Fix panic for `cp -p' by root to an msdos file system. Improve handling of attributes so that `cp -p' to an msdos file system can succeed under favourable circumstances (no uid or gid changes and no nonzero flags except SF_ARCHIVED).
msdosfs_vnops.c: The in-core inode flags were confused with the on-disk inode flags, so chflags() clobbered the lock flag and caused a panic.
denode.h, msdosfs_denode.c, msdosfs_vnops.c: Support the msdosfs archive attribute (ATTR_ARCHIVE) by mapping it to the complement of the SF_ARCHIVED flag and setting the ATTR_ARCHIVE bit when a file's modification time is set (but not when a file's permissions are set; this is the standard wrong DOS behaviour).
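The complement mapping can be sketched as below. The two constants use the conventional values (0x20 for the DOS archive attribute bit, 0x00010000 for the BSD SF_ARCHIVED file flag), defined locally so the example stands alone. The helper names `attr_to_flags` and `flags_to_attr` are hypothetical and not taken from the msdosfs sources.

```c
/* DOS directory-entry attribute bit: file changed since last backup. */
#ifndef ATTR_ARCHIVE
#define ATTR_ARCHIVE	0x20
#endif
/* BSD file flag: file has been archived. */
#ifndef SF_ARCHIVED
#define SF_ARCHIVED	0x00010000
#endif

/* Hypothetical helper: derive the BSD flag from the DOS attribute byte. */
unsigned long
attr_to_flags(unsigned char attr)
{
	/* ATTR_ARCHIVE set means "needs archiving", so SF_ARCHIVED is clear. */
	return ((attr & ATTR_ARCHIVE) ? 0UL : (unsigned long)SF_ARCHIVED);
}

/* Hypothetical helper: fold the BSD flag back into the DOS attribute byte. */
unsigned char
flags_to_attr(unsigned char attr, unsigned long flags)
{
	if (flags & SF_ARCHIVED)
		return ((unsigned char)(attr & ~ATTR_ARCHIVE));
	return ((unsigned char)(attr | ATTR_ARCHIVE));
}
```

Under this mapping, a file whose archive bit is set on disk reports no SF_ARCHIVED flag, and setting SF_ARCHIVED clears the bit again while leaving the other attribute bits alone.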
denode.h, msdosfs_denode.c: Remove the DE_UPDAT() macro. It was only used once, and the corresponding macro in ufs has already been removed.
denode.h: Don't change the timestamp for directories in DE_TIMES() (be consistent with deupdat()).
msdosfs_vnops.c: Handle chown() better: return EPERM instead of EINVAL if there are insufficient permissions; otherwise, allow null changes.
|
5083 |
12-Dec-1994 |
bde |
Fix numerous timestamp bugs.
DE_UPDATE was confused with DE_MODIFIED in some places (they do have confusing names). Handle them exactly the same as IN_UPDATE and IN_MODIFIED. This fixes chmod() and chown() clobbering the mtime and other bugs.
DE_MODIFIED was set but not used.
Parenthesize macro args.
DE_TIMES() now takes a timeval arg instead of a timespec arg. It was stupid to use a macro for speed and do unused conversions to prepare for the macro.
Restore the left shifting of the DOS seconds count by 1. It got lost among the shifts for the bitfields, so DOS seconds counts appeared to range from 0 to 29 seconds (step 1) instead of 0 to 58 seconds (step 2).
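The lost shift is easier to see with the field layout written out. In the standard 16-bit DOS time word, bits 0-4 hold the seconds divided by two, bits 5-10 the minutes, and bits 11-15 the hours. The sketch below assumes that layout; the macro and function names are illustrative, not copied from the msdosfs sources.

```c
#include <stdint.h>

#define DT_2SECONDS_MASK	0x001Fu	/* bits 0-4: seconds / 2 */
#define DT_MINUTES_MASK		0x07E0u	/* bits 5-10: minutes */
#define DT_MINUTES_SHIFT	5
#define DT_HOURS_MASK		0xF800u	/* bits 11-15: hours */
#define DT_HOURS_SHIFT		11

/* Seconds field in real seconds: the stored count must be doubled. */
unsigned
dos_time_seconds_field(uint16_t dt)
{
	return ((dt & DT_2SECONDS_MASK) << 1);
}

/* Convert a DOS time word to seconds since midnight. */
unsigned
dos_time_to_daysecs(uint16_t dt)
{
	unsigned mins = (dt & DT_MINUTES_MASK) >> DT_MINUTES_SHIFT;
	unsigned hours = (dt & DT_HOURS_MASK) >> DT_HOURS_SHIFT;

	return (dos_time_seconds_field(dt) + 60 * mins + 3600 * hours);
}

/* Convert seconds since midnight to a DOS time word (odd seconds round down). */
uint16_t
daysecs_to_dos_time(unsigned secs)
{
	unsigned hours = secs / 3600;
	unsigned mins = (secs / 60) % 60;
	unsigned s = secs % 60;

	return ((uint16_t)((hours << DT_HOURS_SHIFT) |
	    (mins << DT_MINUTES_SHIFT) | (s >> 1)));
}
```

Without the `<< 1` in `dos_time_seconds_field()`, a stored count of 29 would read back as 29 seconds rather than 58, which is the symptom described above.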
Actually use the passed-in mtime in deupdat() as documented so that utimes() works.
Change `extern __inline's to `static inline's so that msdosfs_fat.o can be linked when it is compiled without -O.
Remove faking of directory mtimes to always be the current time. It's more surprising for directory mtimes to change when you read the directories than for them not to change when you write the directories. This should be controlled by a mount-time option if at all.
|
4868 |
29-Nov-1994 |
ache |
Restore mv check; omitting it causes a panic.
Submitted by: Ade Barkah
|
4463 |
14-Nov-1994 |
bde |
Undo a previous change. <sys/disklabel.h> was broken, not these files.
|
4456 |
14-Nov-1994 |
bde |
Remove the bogus include of <sys/dkbad.h>.
|
4140 |
04-Nov-1994 |
dg |
From tim@cs.city.ac.uk (Tim Wilkinson):
Find enclosed a short bugfix to get the union filesystem up and running in FreeBSD-current. We don't think we've got all the problems yet but these fixes sort out the major ones (which mostly concern bad locking of vnodes), no doubt we'll post others as necessary. Known problems include the inability of the umount command (not the system call) to unmount unions in certain circumstances (this is due to the way "realpath" works), and the failure of direntries to always get all available files in unioned subdirectories. We are, as they say, working on it.
Submitted by: tim@cs.city.ac.uk (Tim Wilkinson)
|
4057 |
01-Nov-1994 |
jkh |
Fix from John Hay to avoid kernel panics when ap->a_eofflag is NULL. I'm not sure if this is just masking another problem (like, should ap->a_eofflag EVER be NULL?), but if it prevents a panic for now then it may save an ALPHA customer.
Submitted by: jhay
|
3962 |
28-Oct-1994 |
jkh |
From: fredriks@mcs.com (Lars Fredriksen) ... It turns out that these files do not include <sys/dkbad.h> before <sys/disklabel.h>.
Submitted by: fredriks
|
3935 |
27-Oct-1994 |
pst |
Set the EOF flag properly.
Obtained from: netbsd-bugs mailing list
|
3805 |
23-Oct-1994 |
martin |
Fixed panic when unmounting floppy msdos filesystems. Problem was we weren't flushing dirty buffers. Fix stolen from ffs_fsync()
|
3687 |
18-Oct-1994 |
dg |
Fixed bug I just introduced that would have allowed a user to clobber his kernel stack.
|
3685 |
18-Oct-1994 |
dg |
Allow upages to be paged in/accessed.
Submitted by: John Dyson
|
3498 |
10-Oct-1994 |
phk |
Cosmetics. Silence gcc -Wall
|
3496 |
10-Oct-1994 |
phk |
Cosmetics. reduce the noise from gcc -Wall.
|
3442 |
08-Oct-1994 |
phk |
Cosmetics: added a #include and a static prototype to silence gcc.
|
3396 |
06-Oct-1994 |
dg |
Use tsleep() rather than sleep so that 'ps' is more informative about the wait.
|
3311 |
02-Oct-1994 |
phk |
GCC cleanup.
|
3167 |
28-Sep-1994 |
dfr |
Make NFS ask the filesystems for directory cookies instead of making them itself.
|
3152 |
27-Sep-1994 |
phk |
Added declarations, fixed bugs due to missing decls. At least one of them could panic a system. (I know, it panicked mine!).
|
3106 |
26-Sep-1994 |
gpalmer |
Alterations to silence gcc -Wall. Some unused variables deleted.
Reviewed by: davidg
|
3054 |
24-Sep-1994 |
dg |
1) Added "." and ".." entries.
2) Fixed directory size to return something reasonable.
3) Disabled "file" until the code is completed.
4) Corrected directory link counts.
|
3034 |
23-Sep-1994 |
dg |
Include <sys/kernel.h> not <kernel.h>
|
2979 |
22-Sep-1994 |
wollman |
More loadable VFS changes:
- Make a number of filesystems work again when they are statically compiled (blush)
- FIFOs are no longer optional; ``options FIFO'' removed from distributed config files.
|
2960 |
21-Sep-1994 |
wollman |
Fix a few niggling little bugs:
- set args->lkm_offset correctly so that VFS modules can be unloaded
- initialize _fs_vfsops.vfc_refcount correctly so that VFS modules can be unloaded
- include kernel.h in a few places to get the correct definition of DATA_SET
|
2946 |
21-Sep-1994 |
wollman |
Implemented loadable VFS modules, and made most existing filesystems loadable. (NFS is a notable exception.)
|
2899 |
19-Sep-1994 |
dfr |
Changed some NetBSD backwards compatibility code which was confusing mountd.
|
2893 |
19-Sep-1994 |
dfr |
Added msdosfs.
Obtained from: NetBSD
|
2807 |
15-Sep-1994 |
bde |
Supply prototypes for some functions that were implicitly declared and fix the resulting warnings.
|
2806 |
15-Sep-1994 |
bde |
Remove the unnecessary inclusion of disklabel.h in cd9660_vfsops.c so that I don't have to worry about the latter when changing disklabel.h.
Supply prototypes for some functions that were implicitly declared and fix the resulting warnings and errors (timevals were punned to timespecs).
|
2610 |
09-Sep-1994 |
dg |
Relaxed panic in fdesc_setattr() to just return error.
|
2609 |
09-Sep-1994 |
dg |
Fixed off by one error in referencing an array.
Stolen from: NetBSD
|
2604 |
09-Sep-1994 |
dfr |
Fixed some confusion between the size of a logical block and the size of a device block which was stopping symbolic links working.
cd9660_readdir was incorrectly casting a pointer to the d_namlen field of a struct dirent to a (u_short*) which caused the directory entries "." and ".." to read incorrectly.
Submitted by: dfr
|
2152 |
20-Aug-1994 |
dg |
Implemented filesystem clean bit via:
machdep.c: Changed printf's a little and call vfs_unmountall() if the sync was successful.
cd9660_vfsops.c, ffs_vfsops.c, nfs_vfsops.c, lfs_vfsops.c: Allow dismount of root FS. It is now disallowed at a higher level.
vfs_conf.c: Removed unused rootfs global.
vfs_subr.c: Added new routines vfs_unmountall and vfs_unmountroot. Filesystems are now dismounted if the machine is properly rebooted.
ffs_vfsops.c: Toggle clean bit at the appropriate places. Print warning if an unclean FS is mounted.
ffs_vfsops.c, lfs_vfsops.c: Fix bug in selecting proper flags for VOP_CLOSE().
vfs_syscalls.c: Disallow dismounting root FS via umount syscall.
|
2142 |
20-Aug-1994 |
dg |
1) cleaned up after Garrett - fixed more redundant declarations, changed use of timeout_t -> timeout_func_t in aha1542 and aha1742 drivers.
2) fix a bug in the portalfs that was uncovered by better prototyping - specifically, the time must be converted from timeval to timespec before storing in va_atime.
3) fixed/added some miscellaneous prototypes
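The portalfs fix in item 2 comes down to a unit conversion: a timeval carries microseconds while a timespec carries nanoseconds, so copying one over the other leaves the sub-second field off by a factor of 1000. A minimal sketch of the conversion follows; `timeval_to_timespec` is a hypothetical helper written for illustration, not the routine used in the kernel.

```c
#include <sys/time.h>	/* struct timeval */
#include <time.h>	/* struct timespec */

/* Hypothetical helper: convert microsecond resolution to nanosecond resolution. */
struct timespec
timeval_to_timespec(const struct timeval *tv)
{
	struct timespec ts;

	ts.tv_sec = tv->tv_sec;
	ts.tv_nsec = (long)tv->tv_usec * 1000L;	/* usec -> nsec */
	return (ts);
}
```

A stricter prototype catches the mix-up at compile time because the two structures are distinct types, which is how the bug surfaced here.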
|
2112 |
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
1937 |
08-Aug-1994 |
dg |
Changed B_AGE policy to work correctly in a world with relatively large buffer caches. The old policy generally ended up caching nothing.
|
1817 |
02-Aug-1994 |
dg |
Added $Id$
|
1549 |
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman
|
1541 |
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|