#
c662306e |
|
20-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
Add kern_openatfp(9) Reviewed by: markj, pjd Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43529
|
#
3d59b93b |
|
20-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
kern_openat(): minor style fixes Reviewed by: markj, pjd Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43529
|
#
2a284076 |
|
20-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
kern_openat(): rename fd argument to dirfd Reviewed by: markj, pjd Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43529
|
#
b068bb09 |
|
07-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
Add vnode_pager_clean_{a,}sync(9) Bump __FreeBSD_version for ZFS use. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43356
|
#
29363fb4 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
|
#
15a51d3a |
|
28-Sep-2023 |
Mariusz Zaborski <oshogbo@FreeBSD.org> |
copy_file_range: require CAP_SEEK capability When using copy_file_range(2) with an offset parameter, the CAP_SEEK capability should be required. This requirement is similar to the behavior observed with pread(2)/pwrite(2). Reported by: theraven Reviewed by: emaste, theraven, kib, markj Approved by: secteam Differential Revision: https://reviews.freebsd.org/D41967
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
9c3bfe2a |
|
10-Jul-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert "VFS: Remove VV_READLINK flag" and "fdescfs: improve linrdlnk mount option" This reverts commits 4a402dfe0bc44770c9eac6e58a501e4805e29413 and 3bffa2262328e4ff1737516f176107f607e7bc76. The fix will be implemented in somewhat different manner. The semantic adjustment is incompatible with linuxolator expectations. Reported and reviewed by: dchagin Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D40969
|
#
4a402dfe |
|
21-Jun-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
VFS: Remove VV_READLINK flag since its only reason to exist is removed. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D40700
|
#
a1d71ceb |
|
02-May-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
fstatat(2): restore AT_EMPTY_PATH handling Fixes: cb858340dcbf214cc4c4d78dbb741620d7b3a252 Reported by: markj Sponsored by: The FreeBSD Foundation
|
#
cb858340 |
|
28-Apr-2023 |
Dmitry Chagin <dchagin@FreeBSD.org> |
linux(4): Add a dedicated statat() implementation Get rid of calling Linux stat translation hook and specific to Linux handling of non-vnode dirfd from kern_statat(), Reviewed by: kib, mjg Differential revision: https://reviews.freebsd.org/D35474
|
#
56da4aa5 |
|
14-Dec-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: stop using SAVESTART for rename ni_startdir has never reached rename routines anyway Reviewed by: mckusick Tested by: pho Differential Revision: https://reviews.freebsd.org/D34468
|
#
a75d1ddd |
|
17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce V_PCATCH to stop abusing PCATCH
|
#
5b5b7e2c |
|
17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: always retain path buffer after lookup This removes some of the complexity needed to maintain HASBUF and allows for removing injecting SAVENAME by filesystems. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D36542
|
#
41a0a99f |
|
16-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: slightly reorganize error handling in chroot This avoids duplicated NDFREE_NOTHING which will be of importance later.
|
#
3e0b4868 |
|
07-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: flip a condition around in kern_statat error tends to be 0.
|
#
84a0be4a |
|
16-Aug-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: plug a dead store in kern_linkat_vp Reported by: clang --analyze
|
#
31d1b816 |
|
28-May-2022 |
Dmitry Chagin <dchagin@FreeBSD.org> |
sysent: Get rid of bogus sys/sysent.h include. Where appropriate hide sysent.h under proper condition. MFC after: 2 weeks
|
#
cdb337b0 |
|
20-May-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix copy-pasto in previous Reported by: dchagin
|
#
ec3c2257 |
|
15-May-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: call vn_truncate_locked from kern_truncate This fixes a bug where the syscall would not bump writecount. PR: 263999
|
#
6b715687 |
|
15-May-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: make sure truncate always calls NDFREE_* While here convert it to NDFREE_NOTHING.
|
#
362ff986 |
|
13-Apr-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert rest of a5970a529c2d95271: use vrefact() when working on fp->f_vnode Now, since O_PATH-opened file descriptors use use references instead of the hold references, vrefact() chahges from that revision can be reverted. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34906
|
#
bf13db08 |
|
12-Apr-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
Mostly revert a5970a529c2d95271: Make files opened with O_PATH to not block non-forced unmount Problem is that open(O_PATH) on nullfs -o nocache is broken then, because there is no reference on the vnode after the open syscall exits. Reported and tested by: ambrisko Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
c6487446 |
|
11-Apr-2022 |
Dmitry Chagin <dchagin@FreeBSD.org> |
getdirentries: return ENOENT for unlinked but still open directory. To be more compatible to IEEE Std 1003.1-2008 (“POSIX.1”). Reviewed by: mjg, Pau Amma (doc) Differential revision: https://reviews.freebsd.org/D34680 MFC after: 2 weeks
|
#
b7262756 |
|
02-Apr-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fixup WANTIOCTLCAPS on open In some cases vn_open_cred overwrites cn_flags, effectively nullifying initialisation done in NDINIT. This will have to be fixed. In the meantime make sure the flag is passed. Reported by: jenkins Noted by: Mathieu <sigsys@gmail.com>
|
#
0c805718 |
|
24-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix memory leak on lookup with fds with ioctl caps Reviewed by: markj PR: 262515 Noted by: firk@cantconnect.ru Differential Revision: https://reviews.freebsd.org/D34667
|
#
bb92cd7b |
|
24-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)
|
#
513c7a6e |
|
10-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make fget_unlocked take a thread argument Just like other fget routines. This enables embedding fd table pointer in struct thread, avoiding taking a trip through proc.
|
#
300cfb96 |
|
07-Feb-2022 |
Mark Johnston <markj@FreeBSD.org> |
file: Make fget*() and getvnode*() consistent about initializing *fpp Most fget*() functions initialize the output parameter to NULL. Make the externally visible interface behave consistently, and make fget_unlocked_seq() private to kern_descrip.c. This fixes at least one bug in a consumer, _filemon_wrapper_openat(), which assumes that getvnode() sets the output file pointer to NULL upon an error. Reported by: syzbot+01c0459408f896a5933a@syzkaller.appspotmail.com Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34190
|
#
2cee5861 |
|
28-Dec-2021 |
John Baldwin <jhb@FreeBSD.org> |
sys/kern: Use C99 fixed-width integer types. No functional change. Reviewed by: imp, kib Differential Revision: https://reviews.freebsd.org/D33630
|
#
7e1d3eef |
|
25-Nov-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove the unused thread argument from NDINIT* See b4a58fbf640409a1 ("vfs: remove cn_thread") Bump __FreeBSD_version to 1400043.
|
#
6eefabd4 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: improve nstat, nfstat, nlstat Optionally return errors when truncating dev_t, ino_t, and nlink_t. In the interest of code reuse, use freebsd11_cvtstat() to perform the truncation and error handling and then convert the resulting struct freebsd11_stat to struct nstat. Add missing freebsd32 compat syscalls. These syscalls require translation because struct nstat contains four instances of struct timespec which in turn contains a time_t and a long. Reviewed by: kib
|
#
2b9d052d |
|
17-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
freebsd32: fix getfsstat sign extension bugs Add freebsd32 versions of getfsstat and freebsd11_getfsstat so that bufsize is properly sign-extended if a negative value is passed. Reject negative values before passing to kern_getfsstat as a size_t. Reviewed by: kevans
|
#
b7fd8611 |
|
17-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: sprinkle in const values Add missing const qualifiers to a number of syscall arguments. Obtained from: CheriBSD Reviewed by: kevans
|
#
57093f93 |
|
09-Nov-2021 |
John Baldwin <jhb@FreeBSD.org> |
vfs: Consistently validate AT_* flags in kern_* functions. Some syscalls checked for invalid AT_* flags in sys_* and others in kern_*. Reviewed by: kib Obtained from: CheriBSD Sponsored by: The University of Cambridge, Google Inc. Differential Revision: https://reviews.freebsd.org/D32864
|
#
2b68eb8e |
|
01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove thread argument from VOP_STAT and fo_stat.
|
#
93e05234 |
|
10-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add predicts to getvnode and getvnode_path
|
#
5fb54d2f |
|
08-Oct-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
readlinkat(2): allow O_PATH fd PR: 258856 Reported by: ashish Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32390
|
#
9446d9e8 |
|
13-Aug-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
fstatat(2): handle non-vnode file descriptors for AT_EMPTY_PATH Set NIRES_EMPTYPATH earlies, to have use of EMPTYPATH recorded even if we are going to return error. When namei_setup() refused to accept dirfd, which is not of the vnode type, and indicated by ENOTDIR error return, fall back to kern_fstat(dirfd). Reported by: dchagin Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31530
|
#
eca9ac5a |
|
09-Aug-2021 |
Mark Johnston <markj@FreeBSD.org> |
vfs: Avoid a comparison with an uninitialized field in setutimes() Some filesystems, e.g., devfs, do not populate va_birthtime in their GETATTR implementations. To handle this, make sure that va_birthtime is initialized to the quasi-standard value of { VNOVAL, 0 } before calling VOP_GETATTR. Reported by: KMSAN Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31468
|
#
0ef5eee9 |
|
03-Aug-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add vn_lktype_write() and remove repetetive code that calculates vnode locking type for write. Reviewed by: khng, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31405
|
#
a40cf417 |
|
20-Jul-2021 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Implement unprivileged chroot This builds on recently introduced NO_NEW_PRIVS flag to implement unprivileged chroot, enabled by `security.bsd.unprivileged_chroot`. It allows non-root processes to chroot(2), provided they have the NO_NEW_PRIVS flag set. The chroot(8) utility gets a new flag, -n, which sets NO_NEW_PRIVS before chrooting. Reviewed By: kib Sponsored By: EPSRC Relnotes: yes Differential Revision: https://reviews.freebsd.org/D30130
|
#
802cf4ab |
|
14-Jun-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
namei: add NDPREINIT() macro Its intent is to do the initialization of the future part of struct nameidata which should be used across several namei() and VOPs. Right now it is NOP. Reviewed by: mckusick Discussed with: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D30041
|
#
a4b07a27 |
|
11-May-2021 |
Jason A. Harmening <jah@FreeBSD.org> |
VFS_QUOTACTL(9): allow implementation to indicate busy state changes Instead of requiring all implementations of vfs_quotactl to unbusy the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param to VFS_QUOTACTL(9). The implementation may then indicate to the caller whether it needed to unbusy the mount. Also, add stbool.h to libprocstat modules which #define _KERNEL before including sys/mount.h. Otherwise they'll pull in sys/types.h before defining _KERNEL and therefore won't have the bool definition they need for mp_busy. Reviewed By: kib, markj Differential Revision: https://reviews.freebsd.org/D30556
|
#
271fcf1c |
|
29-May-2021 |
Jason A. Harmening <jah@FreeBSD.org> |
Revert commits 6d3e78ad6c11 and 54256e7954d7 Parts of libprocstat like to pretend they're kernel components for the sake of including mount.h, and including sys/types.h in the _KERNEL case doesn't fix the build for some reason. Revert both the VFS_QUOTACTL() change and the follow-up "fix" for now.
|
#
6d3e78ad |
|
11-May-2021 |
Jason A. Harmening <jah@FreeBSD.org> |
VFS_QUOTACTL(9): allow implementation to indicate busy state changes Instead of requiring all implementations of vfs_quotactl to unbusy the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param to VFS_QUOTACTL(9). The implementation may then indicate to the caller whether it needed to unbusy the mount. Reviewed By: kib, markj Differential Revision: https://reviews.freebsd.org/D30218
|
#
a2691838 |
|
23-May-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: elide vnode locking when it is only needed for audit if possible
|
#
5d1d844a |
|
25-Apr-2021 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
kern_linkat: modify to accept AT_ flags instead of FOLLOW/NOFOLLOW This makes this API match other kern_xxxat() functions. Reviewed By: kib Sponsored By: EPSRC Differential Revision: https://reviews.freebsd.org/D29776
|
#
578c26f3 |
|
19-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
linkat(2): check NIRES_EMPTYPATH on the first fd arg Reported by: arichardson Reviewed by: markj MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29834
|
#
bbf7a4e8 |
|
07-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
O_PATH: allow vnode kevent filter on such files if VREAD access is checked as allowed during open Requested by: wulf Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
a5970a52 |
|
03-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Make files opened with O_PATH to not block non-forced unmount by only keeping hold count on the vnode, instead of the use count. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
8d9ed174 |
|
17-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
open(2): Implement O_PATH Reviewed by: markj Tested by: pho Discussed with: walker.aj325_gmail.com, wulf Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
509124b6 |
|
07-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add AT_EMPTY_PATH for several *at(2) syscalls It is currently allowed to fchownat(2), fchmodat(2), fchflagsat(2), utimensat(2), fstatat(2), and linkat(2). For linkat(2), PRIV_VFS_FHOPEN privilege is required to exercise the flag. It allows to link any open file. Requested by: trasz Tested by: pho, trasz Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29111
|
#
42be0a7b |
|
17-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Style. Add missed spaces, wrap long lines. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
ead7697f |
|
04-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Restore AT_RESOLVE_BENEATH support for funlinkat(2)/unlinkat(2). MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
#
20e91ca3 |
|
15-Feb-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
open(2): Remove O_BENEATH and AT_BENEATH with the reasoning that the flags did not worked properly, and were not shipped in a release. O_RESOLVE_BENEATH is kept as useful. Reviewed by: markj Tested by: arichardson, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D28907
|
#
81174cd8 |
|
15-Feb-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: employ vfs_ref_from_vp in statfs and fstatfs Avoids locking and unlocking the vnode. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D28695
|
#
adf28ab4 |
|
31-Jan-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
fifo: minor comment and assert improvements. In particular, replace a note that reload through vget() is obsoleted, with explanation why this code is required. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
|
#
3b2aa360 |
|
28-Jan-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Use VOP_VPUT_PAIR() for eligible VFS syscalls. The current list is limited to the cases where UFS needs to handle vput(dvp) specially. Which means VOP_CREATE(), VOP_MKDIR(), VOP_MKNOD(), VOP_LINK(), and VOP_SYMLINK(). Reviewed by: chs, mkcusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
|
#
5753be8e |
|
13-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add refcount argument to falloc_noinstall This lets callers avoid atomic ops by initializing the count to required value from the get go. While here add falloc_abort to backpedal from this without having to fdrop.
|
#
5171310e |
|
12-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: use finstall_refed in openat This avoids 2 atomic ops in the common case: 1 to grab an extra reference and 1 to release it.
|
#
cdb62ab7 |
|
11-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add NDFREE_NOTHING and convert several NDFREE_PNBUF callers Check the comment above the routine for reasoning.
|
#
8c9d7463 |
|
27-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: stop open-coding setting WILLBEDIR flag
|
#
002e18eb |
|
27-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add FAILIFEXISTS flag Both FreeBSD and Linux mkdir -p walk the tree up ignoring any EEXIST on the way and both are used a lot when building respective kernels. This poses a problem as spurious locking avoidably interferes with concurrent operations like getdirentries on affected directories. Work around the problem by adding FAILIFEXISTS flag. In case of lockless lookup this manages to avoid any work to begin with, there is no speed up for the locked case but perhaps this can be augmented later on. For simplicity the only supported semantics are as used by mkdir. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D27789
|
#
d48c2b8d |
|
13-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: correctly predict last fdrop on failed open Arguably since the count is guaranteed to be 1 the code should be modified to avoid the work.
|
#
85078b85 |
|
17-Nov-2020 |
Conrad Meyer <cem@FreeBSD.org> |
Split out cwd/root/jail, cmask state from filedesc table No functional change intended. Tracking these structures separately for each proc enables future work to correctly emulate clone(2) in linux(4). __FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof. Reviewed by: kib Discussed with: markj, mjg Differential Revision: https://reviews.freebsd.org/D27037
|
#
de774e42 |
|
17-Nov-2020 |
Conrad Meyer <cem@FreeBSD.org> |
linux(4): Implement name_to_handle_at(), open_by_handle_at() They are similar to our getfhat(2) and fhopen(2) syscalls. Differential Revision: https://reviews.freebsd.org/D27111
|
#
441eb16a |
|
13-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Allow some VOPs to return ERELOOKUP to indicate VFS operation restart at top level. Restart syscalls and some sync operations when filesystem indicated ERELOOKUP condition, mostly for VOPs operating on metdata. In particular, lookup results cached in the inode/v_data is no longer valid and needs recalculating. Right now this should be nop. Assert that ERELOOKUP is catched everywhere and not returned to userspace, by asserting that td_errno != ERELOOKUP on syscall return path. In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136
|
#
c7520caa |
|
22-Oct-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: prevent avoidable evictions on mkdir of existing directories mkdir -p /foo/bar/baz will mkdir each path component and ignore EEXIST. The NOCACHE lookup will make the namecache unnecessarily evict the existing entry, and then fallback to the fs lookup routine eventually leading namei to return an error as the directory is already there. For invocations like mkdir -p /usr/obj/usr/src/sys/GENERIC/modules this triggers fallbacks to the slowpath for concurrently executing lookups. Tested by: pho Discussed with: kib
|
#
1317da43 |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Add O_RESOLVE_BENEATH and AT_RESOLVE_BENEATH to mimic Linux' RESOLVE_BENEATH. It is like O_BENEATH, but disables to walk out of the subtree rooted in the starting directory. O_BENEATH does not care if path walks out if it returned. Requested by: Dan Gohman <dev@sunfishcode.online> PR: 248335 Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
861f039d |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Add at2cnpflags() the helper to convert AT_ flags for *at() syscalls to namei flags. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
c7de3d6f |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Add NIRES_STRICTREL. Stop abusing internal namei flag NI_LCF_STRICTRELATIVE as indicator of cap-restricted lookup. Add designated returned flag NIRES_STRICTREL to inform kern_openat() that lookup was restricted. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
96474d2a |
|
15-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not copy vp into f_data for DTYPE_VNODE files. The pointer to vnode is already stored into f_vnode, so f_data can be reused. Fix all found users of f_data for DTYPE_VNODE. Provide finit_vnode() helper to initialize file of DTYPE_VNODE type. Reviewed by: markj (previous version) Discussed with: freqlabs (openzfs chunk) Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26346
|
#
3b444436 |
|
11-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
devfs: rework si_usecount to track opens This removes a lot of special casing from the VFS layer. Reviewed by: kib (previous version) Tested by: pho (previous version) Differential Revision: https://reviews.freebsd.org/D25612
|
#
25e42ee2 |
|
10-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the hello world stat probes from the vfs provider Interested parties can get the same information by hoooking on vop_stat.
|
#
51ea7bea |
|
07-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add VOP_STAT The current scheme of calling VOP_GETATTR adds avoidable overhead. An example with tmpfs doing fstat (ops/s): before: 7488958 after: 7913833 Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D25910
|
#
fad6dd77 |
|
29-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: elide MAC-induced locking on rename if there are no relevant hoooks
|
#
fd8c6a48 |
|
29-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: honor error code returned by mac_vnode_check_rename_from MFC after: 3 days
|
#
74f61cae |
|
10-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix early termination of kern_getfsstat The kernel would unlock already unlocked mutex if the buffer got filled up before the mount list ended. Reported by: pho Fixes: r363069 ("vfs: depessimize getfsstat when only the count is requested")
|
#
6c69e697 |
|
10-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: depessimize getfsstat when only the count is requested This avoids relocking mountlist_mtx for each entry.
|
#
f2706588 |
|
21-Jun-2020 |
Thomas Munro <tmunro@FreeBSD.org> |
vfs: track sequential reads and writes separately For software like PostgreSQL and SQLite that sometimes reads sequentially while also writing sequentially some distance behind with interleaved syscalls on the same fd, performance is better on UFS if we do sequential access heuristics separately for reads and writes. Patch originally by Andrew Gierth in 2008, updated and proposed by me with his permission. Reviewed by: mjg, kib, tmunro Approved by: mjg (mentor) Obtained from: Andrew Gierth <andrew@tao11.riddles.org.uk> Differential Revision: https://reviews.freebsd.org/D25024
|
#
8cf8c2f6 |
|
24-Mar-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
kern_copy_file_range(): check the file type. The syscall can only operate on valid vnode types. Reported and tested by: pho Sponsored by: The FreeBSD Foundation
|
#
e126c5a3 |
|
14-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: use new capsicum helpers
|
#
7b2ff0dc |
|
13-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Partially decompose priv_check by adding priv_check_cred_vfs_generation During buildkernel there are very frequent calls to priv_check and they all are for PRIV_VFS_GENERATION (coming from stat/fstat). This results in branching on several potential privileges checking if perhaps that's the one which has to be evaluated. Instead of the kitchen-sink approach provide a way to have commonly used privs directly evaluated.
|
#
52604ed7 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove the seq argument from fget_unlocked It is almost always NULL.
|
#
0a1427c5 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
ktrace: provide ktrstat_error This eliminates a branch from its consumers trading it for an extra call if ktrace is enabled for curthread. Given that this is almost never true, the tradeoff is worth it.
|
#
3ff65f71 |
|
30-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove duplicated empty lines from kern/*.c No functional changes.
|
#
d53d924f |
|
30-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: keep the mount point referenced across sys_quotactl Otherwise we risk running into use-after-free. In particular this codepath ends up dropping all protection before suspending writes: ufs_quotactl -> quotaoff_inchange -> vfs_write_suspend_umnt Reported by: pho
|
#
2856d85e |
|
08-Jan-2020 |
Kyle Evans <kevans@FreeBSD.org> |
posix_fallocate: push vnop implementation into the fileop layer This opens the door for other descriptor types to implement posix_fallocate(2) as needed. Reviewed by: kib, bcr (manpages) Differential Revision: https://reviews.freebsd.org/D23042
|
#
c8b3463d |
|
07-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: reimplement deferred inactive to use a dedicated flag (VI_DEFINACT) The previous behavior of leaving VI_OWEINACT vnodes on the active list without a hold count is eliminated. Hold count is kept and inactive processing gets explicitly deferred by setting the VI_DEFINACT flag. The syncer is then responsible for vdrop. Reviewed by: kib (previous version) Tested by: pho (in a larger patch, previous version) Differential Revision: https://reviews.freebsd.org/D23036
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
d6fee74a |
|
12-Dec-2019 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add kern_sync(9), and make kernel code call it instead of going via sys_sync(2). Minor cleanup, no functional changes. Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19366
|
#
abd80ddb |
|
08-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715
|
#
1fccb43c |
|
19-Nov-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: change si_usecount management to count used vnodes Currently si_usecount is effectively a sum of usecounts from all associated vnodes. This is maintained by special-casing for VCHR every time usecount is modified. Apart from complicating the code a little bit, it has a scalability impact since it forces a read from a cacheline shared with said count. There are no consumers of the feature in the ports tree. In head there are only 2: revoke and devfs_close. Both can get away with a weaker requirement than the exact usecount, namely just the count of active vnodes. Changing the meaning to the latter means we only need to modify it on 0<->1 transitions, avoiding the check plenty of times (and entirely in something like vrefact). Reviewed by: kib, jeff Tested by: pho Differential Revision: https://reviews.freebsd.org/D22202
|
#
48e48578 |
|
09-Nov-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Update copy_file_range(2) to be Linux5 compatible. The current linux man page and testing done on a fairly recent linux5.n kernel have identified two changes to the semantics of the linux copy_file_range system call. Since the copy_file_range(2) system call is intended to be linux compatible and is only currently in head/current and not used by any commands, it seems appropriate to update the system call to be compatible with the current linux one. The first of these semantic changes was changed to be compatible with linux5.n by r354564. For the second semantic change, the old linux man page stated that, if infd and outfd referred to the same file, EBADF should be returned. Now, the semantics is to allow infd and outfd to refer to the same file so long as the byte ranges defined by the input file offset, output file offset and len does not overlap. If the byte ranges do overlap, EINVAL should be returned. This patch modifies copy_file_range(2) to be linux5.n compatible for this semantic change.
|
#
e7c1709a |
|
18-Aug-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: stop always overwriting ->mnt_stat in VFS_STATFS The struct is already populated on each mount (and remount). Fields are either constant or not used by filesystem in the first place. Some infrequently used functions use it to avoid having to allocate a new buffer and are left alone. The current code results in an avoidable copying single-threaded and significant cache line bouncing multithreaded While here deduplicate initial filling of the struct. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21317
|
#
bbbbeca3 |
|
24-Jul-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add kernel support for a Linux compatible copy_file_range(2) syscall. This patch adds support to the kernel for a Linux compatible copy_file_range(2) syscall and the related VOP_COPY_FILE_RANGE(9). This syscall/VOP can be used by the NFSv4.2 client to implement the Copy operation against an NFSv4.2 server to do file copies locally on the server. The vn_generic_copy_file_range() function in this patch can be used by the NFSv4.2 server to implement the Copy operation. Fuse may also me able to use the VOP_COPY_FILE_RANGE() method. vn_generic_copy_file_range() attempts to maintain holes in the output file in the range to be copied, but may fail to do so if the input and output files are on different file systems with different _PC_MIN_HOLE_SIZE values. Separate commits will be done for the generated syscall files and userland changes. A commit for a compat32 syscall will be done later. Reviewed by: kib, asomers (plus comments by brooks, jilles) Relnotes: yes Differential Revision: https://reviews.freebsd.org/D20584
|
#
de0b14f2 |
|
08-Apr-2019 |
Mariusz Zaborski <oshogbo@FreeBSD.org> |
In the unlinkat syscall, the operation is performed on the directory descriptor, not the file descriptor. The file descriptor is used only for verification so do not expect any additional capabilities on it. Reported by: antoine Tested by: antoine Discussed with: kib, emaste, bapt Sponsored by: Fudo Security
|
#
a1304030 |
|
06-Apr-2019 |
Mariusz Zaborski <oshogbo@FreeBSD.org> |
Introduce funlinkat syscall that always us to check if we are removing the file associated with the given file descriptor. Reviewed by: kib, asomers Reviewed by: cem, jilles, brooks (they reviewed previous version) Discussed with: pjd, and many others Differential Revision: https://reviews.freebsd.org/D14567
|
#
7cdb0b9d |
|
07-Feb-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix renameat(2) for CAPABILITIES kernels. When renameat(2) is used with: - absolute path for to; - tofd not set to AT_FDCWD; - the target exists kern_renameat() requires CAP_UNLINK capability on tofd, but corresponding namei ni_filecap is not initialized at all because the lookup is absolute. As result, the check was done against empty filecap and syscall fails erronously. Fix it by creating a return flags namei member and reporting if the lookup was absolute, then do not touch to.ni_filecaps at all. PR: 222258 Reviewed by: jilles, ngie Sponsored by: The FreeBSD Foundation MFC after: 1 week X-MFC-note: KBI breakage Differential revision: https://reviews.freebsd.org/D19096
|
#
4f4ef03f |
|
09-Jan-2019 |
Brooks Davis <brooks@FreeBSD.org> |
style(9): fix the indent of a return.
|
#
cc426dd3 |
|
11-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove unused argument to priv_check_cred. Patch mostly generated with cocinnelle: @@ expression E1,E2; @@ - priv_check_cred(E1,E2,0) + priv_check_cred(E1,E2) Sponsored by: The FreeBSD Foundation
|
#
eba8ab0e |
|
10-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove special case handling for getfhat(fd, NULL, handle). There is no reason for it to behave differently from openat(fd, NULL). Also the handling did not worked because the substituted path was from the system address space, causing EFAULT. Submitted by: Jack Halford <jack@gandi.net> MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18501
|
#
18519f15 |
|
07-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Simplify kern_readlink_vp(). When we detected that the vnode is not symlink, return immediately. This moves the readlink code out of else branch and unindents it. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
978f8794 |
|
07-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix expression evaluation. Braces were put in the wrong place, causing failing EAGAIN check to return zero result. Remove the problematic assignment from the conditional expression at all. While there, remove used once variable vp, and wrap too long line. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
d1fd400a |
|
07-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Add new file handle system calls. Namely, getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2). The syscalls are provided for a NFS userspace server (nfs-ganesha). Submitted by: Jack Halford <jack@gandi.net> Sponsored by: Gandi.net Tested by: pho Feedback from: brooks, markj MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18359
|
#
1f6ad48c |
|
29-Nov-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix i386 build after r341220
|
#
71277584 |
|
29-Nov-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop spurious memcpy in stat Sponsored by: The FreeBSD Foundation
|
#
12e69f96 |
|
02-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Add const to input-only char * arguments. These arguments are mostly paths handled by NAMEI*() macros which already take const char * arguments. This change improves the match between syscalls.master and the public declerations of system calls. Reviewed by: kib (prior version) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17812
|
#
4f77f488 |
|
25-Oct-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement O_BENEATH and AT_BENEATH. Flags prevent open(2) and *at(2) vfs syscalls name lookup from escaping the starting directory. Supposedly the interface is similar to the same proposed Linux flags. Reviewed by: jilles (code, previous version of manpages), 0mp (manpages) Discussed with: allanjude, emaste, jonathan Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D17547
|
#
bf94d6c7 |
|
19-Sep-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix state of dquot-less vnodes after failed quotaoff. UFS quotaoff iterates over all mp vnodes, and derefences and clears the pointers to corresponding dquots. If SU work items transiently reference some of dquots,quotaoff() would eventually fail, but all processed vnodes are already stripped from dquots. The state is problematic, since quotas are left enabled, but there is no dquots where blocks and inodes can be accounted. The result is assertion failures and NULL pointer dereferences. Fix it by suspending writes around quotaoff() call. Since the filesystem is synced, no dandling references to dquots from SU workitems can left behind, which means that quotaoff succeeds. The complication there is that quotaoff VFS op is performed with the mount point busied, while to suspend, we need to start write on the mp. If vn_start_write() is called on busied mp, system might deadlock against parallel unmount request. Handle this by unbusy-ing mp before starting write, which in turn requires changing the quotaoff() interface to return with the mount point not busied, same as was done for quotaon(). Reviewed by: mckusick Reported and tested by: pho Sponsored by: The FreeBSD Foundation Approved by: re (gjb) MFC after: 1 week Differential revision: https://reviews.freebsd.org/D17208
|
#
ab35e1c7 |
|
12-Jun-2018 |
Bruce Evans <bde@FreeBSD.org> |
Fix the encoding of major and minor numbers in 64-bit dev_t by restoring the old encodings for the lower 16 and 32 bits and only using the higher 32 bits for unusually large major and minor numbers. This change breaks compatibility with the previous encoding (which was only used in -current). Fix truncation to (essentially) 16-bit dev_t in newnfs v3. Any encoding of device numbers gives an ABI, so it can't be changed without translations for compatibility. Extra bits give the much larger complication that the translations need to compress into fewer bits. Fortunately, more than 32 bits are rarely needed, so compression is rarely needed except for 16-bit linux dev_t where it was always needed but never done. The previous encoding moved the major number into the top 32 bits. Almost no translation code handled this, so the major number was blindly truncated away in most 32-bit encodings. E.g., for ffs, mknod(8) with major = 1 and minor = 2 gave dev_t = 0x10000002; ffs cannot represent this and blindly truncated it to 2. But if this mknod was run on any released version of FreeBSD, it gives dev_t = 0x102. ffs can represent this, but in the previous encoding it was not decoded, giving major = 0, minor = 0x102. The presence of bugs was most obvious for exporting dev_t's from an old system to -current, since bugs in newnfs augment them. I fixed oldnfs to support 32-bit dev_t in 1996 (r16634), but this regressed to 16-bit dev_t in newnfs, first to the old 16-bit encoding and then further in -current. E.g., old ad0 with major = 234, minor = 0x10002 had the correct (major, minor) number on the wire, but newnfs truncated this to (234, 2) and then the previous encoding shifted the major number into oblivion as seen by ffs or old applications. I first tried to fix this by translating on every ABI/API boundary, but there are too many boundaries and too many sloppy translations by blind truncation. So use the old encoding for the low 32 bits so that sloppy translations work no worse than before provided the high 32 bits are not set. Add some error checking for when bits are lost. Keep not doing any error checking for translations for almost everything in compat/linux. compat/freebsd32/freebsd32_misc.c: Optionally check for losing bits after possibly-truncating assignments as before. compat/linux/linux_stats.c: Depend on the representation being compatible with Linux's (or just with itself for local use) and spell some of the translations as assignments in a macro that hides the details. fs/nfsclient/nfs_clcomsubs.c: Essentially the same fix as in 1996, except there is now no possible truncation in makedev() itself. Also fix nearby style bugs. kern/vfs_syscalls.c: As for freebsd32. Also update the sysctl description to include file numbers, and change it to describe device ids as device numbers. sys/types.h: Use inline functions (wrapped by macros) since the expressions are now a bit too complicated for plain macros. Describe the encoding and some of the reasons for it. 16-bit compatibility didn't leave many reasonable choices for the 32-bit encoding, and 32-bit compatibility doesn't leave many reasonable choices for the 64-bit encoding. My choice is to put the 8 new minor bits in the low 8 bits of the top 32 bits. This minimizes discontiguities. Reviewed by: kib (except for rewrite of the comment in linux_stats.c)
|
#
372639f9 |
|
13-Jun-2018 |
Bruce Evans <bde@FreeBSD.org> |
Fix some bugs found while fixing the representation and translation of 64-bit dev_t's (but not ones involving dev_t's). st_size was supposed to be clamped in cvtstat() and linux's copy_stat(), but the clamping code wasn't aware that st_size is signed, and also had an obfuscated off-by-1 value for the unsigned limit, so its effect was to produce a bizarre negative size instead of clamping. Change freebsd32's copy_ostat() to be no worse than cvtstat(). It was missing clamping and bzero()ing of padding. Reviewed by: kib (except a final fix of the clamp to the signed maximum)
|
#
71189909 |
|
19-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Add additional preinitialized cap_rights
|
#
cbd92ce6 |
|
09-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Eliminate the overhead of gratuitous repeated reinitialization of cap_rights - Add macros to allow preinitialization of cap_rights_t. - Convert most commonly used code paths to use preinitialized cap_rights_t. A 3.6% speedup in fstat was measured with this change. Reported by: mjg Reviewed by: oshogbo Approved by: sbruno MFC after: 1 month
|
#
6469bdcd |
|
06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compabibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensure opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941
|
#
b1288166 |
|
17-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Use long for the last argument to VOP_PATHCONF rather than a register_t. pathconf(2) and fpathconf(2) both return a long. The kern_[f]pathconf() functions now accept a pointer to a long value rather than modifying td_retval directly. Instead, the system calls explicitly store the returned long value in td_retval[0]. Requested by: bde Reviewed by: kib Sponsored by: Chelsio Communications
|
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
456a73ef |
|
22-Oct-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the support for mknod(S_IFMT), which created dummy vnodes with VBAD type. FFS ffs_write() VOP catches such vnodes and panics, other VOPs do not check for the type and their behaviour is really undefined. The comment claims that this support was done for 'badsect' to flag bad sectors, we do not have such facility in kernel anyway. Reported by: Dmitry Vyukov <dvyukov@google.com> Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
5532aa9b |
|
12-Oct-2017 |
Ed Maste <emaste@FreeBSD.org> |
allow posix_fallocate in capability mode posix_fallocate is logically equivalent to writing zero blocks to the desired file size and there is no reason to prevent calling it in capability mode. posix_fallocate already checked for the CAP_WRITE right, so we merely need to list it in capabilities.conf. Reviewed by: allanjude MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D12640
|
#
77d3337c |
|
31-Jul-2017 |
Dmitry Chagin <dchagin@FreeBSD.org> |
Implement proper Linux /dev/fd and /proc/self/fd behavior by adding Linux specific things to the native fdescfs file system. Unlike FreeBSD, the Linux fdescfs is a directory containing a symbolic links to the actual files, which the process has open. A readlink(2) call on this file returns a full path in case of regular file or a string in a special format (type:[inode], anon_inode:<file-type>, etc..). As well as in a FreeBSD, opening the file in the Linux fdescfs directory is equivalent to duplicating the corresponding file descriptor. Here we have mutually exclusive requirements: - in case of readlink(2) call fdescfs lookup() method should return VLNK vnode otherwise our kern_readlink() fail with EINVAL error; - in the other calls fdescfs lookup() method should return non VLNK vnode. For what new vnode v_flag VV_READLINK was added, which is set if fdescfs has beed mounted with linrdlnk option an modified kern_readlinkat() to properly handle it. For now For Linux ABI compatibility mount fdescfs volume with linrdlnk option: mount -t fdescfs -o linrdlnk null /compat/linux/dev/fd Reviewed by: kib@ MFC after: 1 week Relnotes: yes
|
#
9fb8c888 |
|
30-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Define ino64_trunc_error under same conditions as the code which uses the variable. Noted by: bde Sponsored by: The FreeBSD Foundation
|
#
7abe0df2 |
|
09-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Enhance vfs.ino64_trunc_error sysctl. Provide a new mode "2" which returns a special overflow indicator in the non-representable field instead of the silent truncation (mode "0") or EOVERFLOW (mode "1"). In particular, the typical use of st_ino to detect hard links with mode "2" reports false positives, which might be more suitable for some uses. Discussed with: bde Sponsored by: The FreeBSD Foundation
|
#
3df7ebc4 |
|
05-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Add sysctl vfs.ino64_trunc_error controlling action on truncating inode number or link count for the ABI compat binaries. Right now, and by default after the change, too large 64bit values are silently truncated to 32 bits. Enabling the knob causes the system to return EOVERFLOW for stat(2) family of compat syscalls when some values cannot be completely represented by the old structures. For getdirentries(2), knob skips the dirents which would cause non-trivial truncation of d_ino. EOVERFLOW error is specified by the X/Open 1996 LFS document ('Adding Support for Arbitrary File Sizes to the Single UNIX Specification'). Based on the discussion with: bde Sponsored by: The FreeBSD Foundation
|
#
69921123 |
|
23-May-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Commit the 64-bit inode project. Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify struct dirent layout to add d_off, increase the size of d_fileno to 64-bits, increase the size of d_namlen to 16-bits, and change the required alignment. Increase struct statfs f_mntfromname[] and f_mntonname[] array length MNAMELEN to 1024. ABI breakage is mitigated by providing compatibility using versioned symbols, ingenious use of the existing padding in structures, and by employing other tricks. Unfortunately, not everything can be fixed, especially outside the base system. For instance, third-party APIs which pass struct stat around are broken in backward and forward incompatible ways. Kinfo sysctl MIBs ABI is changed in backward-compatible way, but there is no general mechanism to handle other sysctl MIBS which return structures where the layout has changed. It was considered that the breakage is either in the management interfaces, where we usually allow ABI slip, or is not important. Struct xvnode changed layout, no compat shims are provided. For struct xtty, dev_t tty device member was reduced to uint32_t. It was decided that keeping ABI compat in this case is more useful than reporting 64-bit dev_t, for the sake of pstat. Update note: strictly follow the instructions in UPDATING. Build and install the new kernel with COMPAT_FREEBSD11 option enabled, then reboot, and only then install new world. Credits: The 64-bit inode project, also known as ino64, started life many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick (mckusick) then picked up and updated the patch, and acted as a flag-waver. Feedback, suggestions, and discussions were carried by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles), and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial ports investigation followed by an exp-run by Antoine Brodin (antoine). Essential and all-embracing testing was done by Peter Holm (pho). The heavy lifting of coordinating all these efforts and bringing the project to completion were done by Konstantin Belousov (kib). Sponsored by: The FreeBSD Foundation (emaste, kib) Differential revision: https://reviews.freebsd.org/D10439
|
#
3e85b721 |
|
16-May-2017 |
Ed Maste <emaste@FreeBSD.org> |
Remove register keyword from sys/ and ANSIfy prototypes A long long time ago the register keyword told the compiler to store the corresponding variable in a CPU register, but it is not relevant for any compiler used in the FreeBSD world today. ANSIfy related prototypes while here. Reviewed by: cem, jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D10193
|
#
8e6be21a |
|
31-Mar-2017 |
Robert Watson <rwatson@FreeBSD.org> |
Audit arguments to posix_fallocate(2) and posix_fadvise(2) system calls. As posix_fadvise() does not lock the vnode argument, don't capture detailed vnode information for the time being. Obtained from: TrustedBSD Project MFC after: 3 weeks Sponsored by: DARPA, AFRL
|
#
fc8bde8f |
|
31-Jan-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Replace calls to sys_truncate() with kern_truncate(). Reviewed by: kib@ MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9371
|
#
f67d6b5f |
|
29-Jan-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add kern_lseek() and use it instead of sys_lseek() in various compats. I didn't touch svr4/, there's no point. Reviewed by: ed@, kib@ MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9366
|
#
ae6b6ef6 |
|
30-Jan-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Replace sys_ftruncate() with kern_ftruncate() in various compats. Reviewed by: kib@ MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9368
|
#
2f304845 |
|
05-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not allocate struct statfs on kernel stack. Right now size of the structure is 472 bytes on amd64, which is already large and stack allocations are indesirable. With the ino64 work, MNAMELEN is increased to 1024, which will make it impossible to have struct statfs on the stack. Extracted from: ino64 work by gleb Discussed with: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
607fa849 |
|
05-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Some style fixes for getfstat(2)-related code. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
6c4338f2 |
|
04-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
The callers of kern_getfsstat(UIO_SYSSPACE) expect that *buf always returns memory which must be freed, regardless of the error. Assign NULL to *buf in case we are not going to allocate any memory due to invalid mode. Reported and tested by: pho Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 weeks (together with r310638) Differential revision: https://reviews.freebsd.org/D9042
|
#
7ee34a31 |
|
02-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
There is no need to use temporary statfs buffer for fsid obliteration and prison enforcement. Do it on the caller buffer directly. Besides eliminating memory copies, this change also removes large structure from the kernel stack. Extracted from: ino64 work by gleb Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
b961dc31 |
|
02-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Style. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
f2af4041 |
|
02-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Move common code from kern_statfs() and kern_fstatfs() into a new helper. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
34ed0c63 |
|
27-Dec-2016 |
John Baldwin <jhb@FreeBSD.org> |
Rename the 'flags' argument to getfsstat() to 'mode' and validate it. This argument is not a bitmask of flags, but only accepts a single value. Fail with EINVAL if an invalid value is passed to 'flag'. Rename the 'flags' argument to getmntinfo(3) to 'mode' as well to match. This is a followup to r308088. Reviewed by: kib MFC after: 1 month
|
#
25e578de |
|
12-Dec-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: use vrefact in getcwd and fchdir
|
#
7359fdcf |
|
01-Nov-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Allow some dotdot lookups in capability mode. If dotdot lookup does not escape from the file descriptor passed as the lookup root, we can allow the component traversal. Track the directories traversed, and check the result of dotdot lookup against the recorded list of the directory vnodes. Dotdot lookups are enabled by sysctl vfs.lookup_cap_dotdot, currently disabled by default until more verification of the approach is done. Disallow non-local filesystems for dotdot, since remote server might conspire with the local process to allow it to escape the namespace. This might be too cautious, provide the knob vfs.lookup_cap_dotdot_nonlocal to override as well. Idea by: rwatson Discussed with: emaste, jonathan, rwatson Reviewed by: mjg (previous version) Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 week Differential revision: https://reviews.freebsd.org/D8110
|
#
53ae7e83 |
|
02-Nov-2016 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Fix getfsstat(2) with MNT_WAIT to not skip filesystems that are in the process of being unmounted. Previously it would skip them, even if the unmount eventually failed eg due to the filesystem being busy. This behaviour broke autounmountd(8) - if you tried to manually unmount a mounted filesystem, using 'automount -u', and the autounmountd attempted to refresh the filesystem list in that very moment, it would conclude that the filesystem got unmounted and not try to unmount it afterwards. Reviewed by: kib@ Tested by: pho@ MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D8030
|
#
6eeff7a7 |
|
28-Oct-2016 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Fix getfsstat(2) handling of flags. The 'flags' argument is an enum, not a bitfield. For the intended usage - being passed either MNT_WAIT, or MNT_NOWAIT - this shouldn't introduce any changes in behaviour. Reviewed by: jhb@ MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D8373
|
#
69a28758 |
|
15-Sep-2016 |
Ed Maste <emaste@FreeBSD.org> |
Renumber license clauses in sys/kern to avoid skipping #3
|
#
93d9ebd8 |
|
15-Aug-2016 |
Ed Schouten <ed@FreeBSD.org> |
Eliminate use of sys_fsync() and sys_fdatasync(). Make the kern_fsync() function public, so that it can be used by other parts of the kernel. Fix up existing consumers to make use of it. Requested by: kib
|
#
295af703 |
|
15-Aug-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Add an implementation of fdatasync(2). The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing code with fsync(2). For all filesystems, this commit provides the implementation which delegates the work of VOP_FDATASYNC() to VOP_FSYNC(). This is functionally correct but not efficient. This is not yet POSIX-compliant implementation, because it does not ensure that queued AIO requests are completed before returning. Reviewed by: mckusick Discussed with: avg (ZFS), jhb (AIO part) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D7471
|
#
d7e8cfd6 |
|
15-Jul-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not allow creation of char or block special nodes with VNOVAL dev_t. As was reported on http://seclists.org/oss-sec/2016/q3/68, tmpfs code contains assertion that rdev != VNOVAL. On FreeBSD, there is no other consequences except triggering the assert. To be compatible with systems where device nodes have some significance, reject mknod(2) call with dev == VNOVAL at the syscall level. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
8ec75c0f |
|
10-Jul-2016 |
Robert Watson <rwatson@FreeBSD.org> |
Audit the file-descriptor number argument for openat(2). Remove a comment about the desirability of auditing the number, as it was in fact in the wrong place (in the common path for open(2) and openat(2), and only the latter accepts a file-descriptor argument). Where other ABIs support openat(2), it may be necessary to do additional argument auditing as it is not performed in kern_openat(9). MFC after: 3 days Sponsored by: DARPA, AFRL
|
#
34e05ebe |
|
31-May-2016 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix kernel stack disclosures in the Linux and 4.3BSD compat layers. Submitted by: CTurt Security: SA-16:20 Security: SA-16:21
|
#
399e8c17 |
|
09-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Simplify AIO initialization now that it is standard. - Mark AIO system calls as STD and remove the helpers to dynamically register them. - Use COMPAT6 for the old system calls with the older sigevent instead of an 'o' prefix. - Simplify the POSIX configuration to note that AIO is always available. - Handle AIO in the default VOP_PATHCONF instead of special casing it in the pathconf() system call. fpathconf() is still hackish. - Remove freebsd32_aio_cancel() as it just called the native one directly. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5589
|
#
0acf5d0b |
|
25-Feb-2016 |
Mark Johnston <markj@FreeBSD.org> |
Improve error handling for posix_fallocate(2) and posix_fadvise(2). - Set td_errno so that ktrace and dtrace can obtain the syscall error number in the usual way. - Pass negative error numbers directly to the syscall layer, as they're not intended to be returned to userland. Reviewed by: kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D5425
|
#
b00b4590 |
|
06-Feb-2016 |
Kirk McKusick <mckusick@FreeBSD.org> |
Clarify a comment in kern_openat() about the use of falloc_noinstall(). Suggested by: Steve Jacobson
|
#
a8723fb8 |
|
20-Nov-2015 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
The freebsd4_getfsstat() was broken in r281551 to always return 0 on success. All versions of getfsstat(3) are supposed to return the number of [o]statfs structs in the array that was copied out. Also fix missing bounds checking and signed comparison of unsigned types. Submitted by: bde@ MFC after: 1 month Sponsored by: The FreeBSD Foundation
|
#
403ec61c |
|
03-Oct-2015 |
Mark Johnston <markj@FreeBSD.org> |
Revert r288628 and instead fix a discrepancy between the posix_fadvise(2) man page and POSIX: posix_fadvise(2) returns an error number on failure. Reported by: jilles MFC after: 1 week
|
#
a7713f76 |
|
03-Oct-2015 |
Mark Johnston <markj@FreeBSD.org> |
The return value of posix_fadvise(2) is just an error status, so sys_posix_fadvise() should simply return the errno (or 0) to syscallenter() rather than setting a return value. MFC after: 1 week
|
#
3138cd36 |
|
30-Sep-2015 |
Mark Johnston <markj@FreeBSD.org> |
As a step towards the elimination of PG_CACHED pages, rework the handling of POSIX_FADV_DONTNEED so that it causes the backing pages to be moved to the head of the inactive queue instead of being cached. This affects the implementation of POSIX_FADV_NOREUSE as well, since it works by applying POSIX_FADV_DONTNEED to file ranges after they have been read or written. At that point the corresponding buffers may still be dirty, so the previous implementation would coalesce successive ranges and apply POSIX_FADV_DONTNEED to the result, ensuring that pages backing the dirty buffers would eventually be cached. To preserve this behaviour in an efficient manner, this change adds a new buf flag, B_NOREUSE, which causes the pages backing a VMIO buf to be placed at the head of the inactive queue when the buf is released. POSIX_FADV_NOREUSE then works by setting this flag in bufs that underlie the specified range. Reviewed by: alc, kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3726
|
#
2f2f522b |
|
27-Sep-2015 |
Andriy Gapon <avg@FreeBSD.org> |
save some bytes by using more concise SDT_PROBE<n> instead of SDT_PROBE SDT_PROBE requires 5 parameters whereas SDT_PROBE<n> requires n parameters where n is typically smaller than 5. Perhaps SDT_PROBE should be made a private implementation detail. MFC after: 20 days
|
#
bc1ace0b |
|
27-Aug-2015 |
Ed Schouten <ed@FreeBSD.org> |
Decompose linkat()/renameat() rights to source and target. To make it easier to understand how Capsicum interacts with linkat() and renameat(), rename the rights to CAP_{LINK,RENAME}AT_{SOURCE,TARGET}. This also addresses a shortcoming in Capsicum, where it isn't possible to disable linking to files stored in a directory. Creating hardlinks essentially makes it possible to access files with additional rights. Reviewed by: rwatson, wblock Differential Revision: https://reviews.freebsd.org/D3411
|
#
97fc0277 |
|
11-Jul-2015 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Try to unbreak the build after r285390 removing the obsolete static declaration.
|
#
f0725a8e |
|
11-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
Move chdir/chroot-related fdp manipulation to kern_descrip.c Prefix exported functions with pwd_. Deduplicate some code by adding a helper for setting fd_cdir. Reviewed by: kib
|
#
4da8456f |
|
16-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
Replace struct filedesc argument in getvnode with struct thread This is is a step towards removal of spurious arguments.
|
#
c3293b83 |
|
18-May-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
Tidy up sys_umask a little bit Consistently use saved fdp pointer as it cannot change. If it could change the code would be already incorrect. No functional changes.
|
#
0538aafc |
|
18-Apr-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
The lseek(2), mmap(2), truncate(2), ftruncate(2), pread(2), and pwrite(2) syscalls are wrapped to provide compatibility with pre-7.x kernels which required padding before the off_t parameter. The fcntl(2) contains compatibility code to handle kernels before the struct flock was changed during the 8.x CURRENT development. The shims were reasonable to allow easier revert to the older kernel at that time. Now, two or three major releases later, shims do not serve any purpose. Such old kernels cannot handle current libc, so revert the compatibility code. Make padded syscalls support conditional under the COMPAT6 config option. For COMPAT32, the syscalls were under COMPAT6 already. Remove WITHOUT_SYSCALL_COMPAT build option, which only purpose was to (partially) disable the removed shims. Reviewed by: jhb, imp (previous versions) Discussed with: peter Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
1c73bcab |
|
15-Apr-2015 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Rewrite linprocfs_domtab() as a wrapper around kern_getfsstat(). This adds missing jail and MAC checks. Differential Revision: https://reviews.freebsd.org/D2193 Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation
|
#
78d75aba |
|
04-Apr-2015 |
Jilles Tjoelker <jilles@FreeBSD.org> |
utimensat: Correct Capsicum required capability rights.
|
#
b7a39e9e |
|
17-Feb-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: simplify fget_unlocked & friends Introduce fget_fcntl which performs appropriate checks when needed. This removes a branch from fget_unlocked. Introduce fget_mmap dealing with cap_rights_to_vmprot conversion. This removes a branch from _fget. Modify fget_unlocked to pass sequence counter to interested callers so that they can perform their own checks and make sure the result was otained from stable & current state. Reviewed by: silence on -hackers
|
#
2205e0d1 |
|
23-Jan-2015 |
Jilles Tjoelker <jilles@FreeBSD.org> |
Add futimens and utimensat system calls. The core kernel part is patch file utimes.2008.4.diff from pluknet@FreeBSD.org. I updated the code for API changes, added the manual page and added compatibility code for old kernels. There is also audit and Capsicum support. A new UTIME_* constant might allow setting birthtimes in future. Differential Revision: https://reviews.freebsd.org/D1426 Submitted by: pluknet (partially) Reviewed by: delphij, pluknet, rwatson Relnotes: yes
|
#
6c21f6ed |
|
18-Dec-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
The VOP_LOOKUP() implementations for CREATE op do not put the name into namecache, to avoid cache trashing when doing large operations. E.g., tar archive extraction is not usually followed by access to many of the files created. Right now, each VOP_LOOKUP() implementation explicitely knowns about this quirk and tests for both MAKEENTRY flag presence and op != CREATE to make the call to cache_enter(). Centralize the handling of the quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP. VFS now sets NOCACHE flag for CREATE namei() calls. Note that the change in semantic is backward-compatible and could be merged to the stable branch, and is compatible with non-changed third-party filesystems which correctly handle MAKEENTRY. Suggested by: Chris Torek <torek@pi-coral.com> Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
db1ec81e |
|
13-Nov-2014 |
Jung-uk Kim <jkim@FreeBSD.org> |
Correct a typo to fix chown(2). It was broken since r274476. Pointy hat to: kib X-MFC-With: r274476
|
#
6e646651 |
|
13-Nov-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the no-at variants of the kern_xx() syscall helpers. E.g., we have both kern_open() and kern_openat(); change the callers to use kern_openat(). This removes one (sometimes two) levels of indirection and consolidates arguments checks. Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
f2c1a52a |
|
13-Nov-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove fossil. It has been present in 4.4Lite2, but its use was removed for some time. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
389a25c7 |
|
12-Nov-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
For posix_fallocate(2) and posix_fadvise(2), return ESPIPE when underlying file does not have DFLAG_SEEKABLE set [1]. For posix_fallocate(2), simplify error handling logic. Do return when fp is not yet referenced. Noted by: bde [1] Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
a39d200b |
|
21-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Reduce nesting in vn_access. No functional changes.
|
#
eac96781 |
|
21-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Avoid crdup when possible in kern_accessat. While here tidy up a little.
|
#
2c6fbcbe |
|
25-Sep-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
In kern_linkat() and kern_renameat(), do not call namei(9) while holding a write reference on the filesystem. Try to get write reference in unblocked way after all vnodes are resolved; if failed, drop all locks and retry after waiting for suspension end. The VFS_UNMOUNT() methods for UFS and tmpfs try to establish suspension on unmount, while covered vnode is locked by VFS, which prevents namei() from stepping over the mount point. The thread doing namei() sleeps on the covered vnode lock, owning the write ref. Reported by: bdrewery Tested by: bdrewery (previous version), pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
257597a4 |
|
15-Sep-2014 |
Enji Cooper <ngie@FreeBSD.org> |
Validate the mode argument in access, eaccess, and faccessat for optional POSIX compliance and to improve compatibility with Linux and NetBSD The issue was identified with lib/libc/sys/t_access:access_inval from NetBSD Update the manpage accordingly PR: 181155 Reviewed by: jilles (code), jmmv (code), wblock (manpage), wollman (code) MFC after: 4 weeks Phabric: D678 (code), D786 (manpage) Sponsored by: EMC / Isilon Storage Division
|
#
65589a29 |
|
16-Jul-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Check for the cross-device cross-link attempt in the VFS, instead of forcing filesystem VOP_LINK() methods to repeat the code. In tmpfs_link(), remove redundand check for the type of the source, already done by VFS. Note that NFS server already performs this check before calling VOP_LINK(). Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
57ef02ff |
|
14-Jul-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
In kern_linkat(), avoid passing doomed vnode to the VOP. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
adf87ab0 |
|
21-Jun-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: replace fd_nfiles with fd_lastfile where appropriate fd_lastfile is guaranteed to be the biggest open fd, so when the intent is to iterate over active fds or lookup one, there is no point in looking beyond that limit. Few places are left unpatched for now. MFC after: 1 week
|
#
4a144410 |
|
16-Mar-2014 |
Robert Watson <rwatson@FreeBSD.org> |
Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks
|
#
9d2437a6 |
|
12-Mar-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
The auio structure is only initialized when the vnode is symlink, avoid reading from it otherwise. Submitted by: Conrad Meyer <cemeyer@uw.edu> MFC after: 1 week
|
#
49d39308 |
|
30-Jan-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
The posix_madvise(3) and posix_fadvise(2) should return error on failure, same as posix_fallocate(2). Noted by: Bob Bishop <rb@gid.co.uk> Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
2852de04 |
|
23-Jan-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
The posix_fallocate(2) syscall should return error number on error, without modifying errno. Reported and tested by: Gennady Proskurin <gpr@mail.ru> Reviewed by: mdf PR: standards/186028 Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
d9fae5ab |
|
26-Nov-2013 |
Andriy Gapon <avg@FreeBSD.org> |
dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE In its stead use the Solaris / illumos approach of emulating '-' (dash) in probe names with '__' (two consecutive underscores). Reviewed by: markj MFC after: 3 weeks
|
#
54366c0b |
|
25-Nov-2013 |
Attilio Rao <attilio@FreeBSD.org> |
- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invokation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip
|
#
44fcd367 |
|
05-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Correct the logic broken in my last commit. Reported by: tijl
|
#
a686a7be |
|
04-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style fixes.
|
#
7008be5b |
|
04-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation
|
#
c0a46535 |
|
21-Aug-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Make the seek a method of the struct fileops. Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
7b77e1fe |
|
14-Aug-2013 |
Mark Johnston <markj@FreeBSD.org> |
Specify SDT probe argument types in the probe definition itself rather than using SDT_PROBE_ARGTYPE(). This will make it easy to extend the SDT(9) API to allow probes with dynamically-translated types. There is no functional change. MFC after: 2 weeks
|
#
0fc6daa7 |
|
11-May-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
- Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The null_hashget() obtains the reference on the nullfs vnode, which must be dropped. - Fix a wart which existed from the introduction of the nullfs caching, do not unlock lower vnode in the nullfs_reclaim_lowervp(). It should be innocent, but now it is also formally safe. Inform the nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on nullfs inode. - Add a callback to the upper filesystems for the lower vnode unlinking. When inactivating a nullfs vnode, check if the lower vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC on the lower vnode, and reclaim upper vnode if so. This allows nullfs to purge cached vnodes for the unlinked lower vnode, avoiding excessive caching. Reported by: G??ran L??wkrantz <goran.lowkrantz@ismobile.com> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
051a23d4 |
|
22-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Constify local path variable for chflagsat(). - Use correct format characters (%lx) for u_long. This fixes the build broken in r248599.
|
#
e948704e |
|
21-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Implement chflagsat(2) system call, similar to fchmodat(2), but operates on file flags. Reviewed by: kib, jilles Sponsored by: The FreeBSD Foundation
|
#
b4b2596b |
|
21-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type u_long. Before this change it was of type int for syscalls, but prototypes in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not for lchflags(2)) stated that it was u_long. Now some related functions use u_long type for flags (strtofflags(3), fflagstostr(3)). - Make path argument of type 'const char *' for consistency. Discussed on: arch Sponsored by: The FreeBSD Foundation
|
#
943c3bb9 |
|
16-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Require CAP_SEEK if both O_APPEND and O_TRUNC flags are absent. In other words we don't require CAP_SEEK if either O_APPEND or O_TRUNC flag is given, because O_APPEND doesn't allow to overwrite existing data and O_TRUNC requires CAP_FTRUNCATE already. Sponsored by: The FreeBSD Foundation
|
#
d6b2bd0b |
|
16-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style: Whitespace fixes.
|
#
1ea67dd9 |
|
16-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style: Remove redundant space.
|
#
89f6b863 |
|
08-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
6d4e99aa |
|
02-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
If the target file already exists, check for the CAP_UNLINKAT capabiity right on the target directory descriptor, but only if this is renameat(2) and real target directory descriptor is given (not AT_FDCWD). Without this fix regular rename(2) fails if the target file already exists. Reported by: Michael Butler <imb@protected-networks.net> Reported by: Larry Rosenman <ler@lerctr.org> Sponsored by: The FreeBSD Foundation
|
#
2609222a |
|
01-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge Capsicum overhaul: - Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights. - The cap_new(2) system call is left, but it is no longer documented and should not be used in new code. - The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one. - The cap_getrights(2) syscall is renamed to cap_rights_get(2). - If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrived with cap_ioctls_get(2) syscall. - If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrive them with cap_fcntls_get(2). - To support ioctl and fcntl white-listing the filedesc structure was heavly modified. - The audit subsystem, kdump and procstat tools were updated to recognize new syscalls. - Capability rights were revised and eventhough I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below: CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT. Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2). Added CAP_SYMLINKAT: - Allow for symlinkat(2). Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2). Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory. Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall. Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. - Required if target of renameat(2) exists and will be removed by this call. Removed CAP_MAPEXEC. CAP_MMAP old behaviour: - Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE. CAP_MMAP new behaviour: - Allow for mmap(2)+PROT_NONE. Added CAP_MMAP_R: - Allow for mmap(PROT_READ). Added CAP_MMAP_W: - Allow for mmap(PROT_WRITE). Added CAP_MMAP_X: - Allow for mmap(PROT_EXEC). Added CAP_MMAP_RW: - Allow for mmap(PROT_READ | PROT_WRITE). Added CAP_MMAP_RX: - Allow for mmap(PROT_READ | PROT_EXEC). Added CAP_MMAP_WX: - Allow for mmap(PROT_WRITE | PROT_EXEC). Added CAP_MMAP_RWX: - Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC). Renamed CAP_MKDIR to CAP_MKDIRAT. Renamed CAP_MKFIFO to CAP_MKFIFOAT. Renamed CAP_MKNODE to CAP_MKNODEAT. CAP_READ old behaviour: - Allow pread(2). - Disallow read(2), readv(2) (if there is no CAP_SEEK). CAP_READ new behaviour: - Allow read(2), readv(2). - Disallow pread(2) (CAP_SEEK was also required). CAP_WRITE old behaviour: - Allow pwrite(2). - Disallow write(2), writev(2) (if there is no CAP_SEEK). CAP_WRITE new behaviour: - Allow write(2), writev(2). - Disallow pwrite(2) (CAP_SEEK was also required). Added convinient defines: #define CAP_PREAD (CAP_SEEK | CAP_READ) #define CAP_PWRITE (CAP_SEEK | CAP_WRITE) #define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ) #define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE) #define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL) #define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W) #define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X) #define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X) #define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X) #define CAP_RECV CAP_READ #define CAP_SEND CAP_WRITE #define CAP_SOCK_CLIENT \ (CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \ CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN) #define CAP_SOCK_SERVER \ (CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \ CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \ CAP_SETSOCKOPT | CAP_SHUTDOWN) Added defines for backward API compatibility: #define CAP_MAPEXEC CAP_MMAP_X #define CAP_DELETE CAP_UNLINKAT #define CAP_MKDIR CAP_MKDIRAT #define CAP_RMDIR CAP_UNLINKAT #define CAP_MKFIFO CAP_MKFIFOAT #define CAP_MKNOD CAP_MKNODAT #define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER) Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib
|
#
f4d0191b |
|
01-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Reduce lock scope a little.
|
#
f0ad2ecb |
|
17-Feb-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style.
|
#
8e1d51ab |
|
17-Feb-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Require CAP_FSYNC capability right when opening a file with O_SYNC or O_FSYNC flags. - While here simplify check for locking flags. Sponsored by: The FreeBSD Foundation
|
#
2ca49983 |
|
07-Feb-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Stop translating the ERESTART error from the open(2) into EINTR. Posix requires that open(2) is restartable for SA_RESTART. For non-posix objects, in particular, devfs nodes, still disable automatic restart of the opens. The open call to a driver could have significant side effects for the hardware. Noted and reviewed by: jilles Discussed with: bde MFC after: 2 weeks
|
#
4bbe7b0c |
|
31-Jan-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Now that MPSAFE flag is gone, we can arrange code a bit better.
|
#
b108953c |
|
31-Jan-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove leftover label after Giant removal from VFS.
|
#
d7259a57 |
|
22-Oct-2012 |
Ed Schouten <ed@FreeBSD.org> |
Remove unused `vfslocked' variable. I have no idea what this `vfslocked' thing means. I wonder how it ended up here.
|
#
5050aa86 |
|
22-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
|
#
36c6f3aa |
|
15-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Acquire the rangelock for truncate(2) as well. Reported and reviewed by: avg Tested by: pho MFC after: 1 week
|
#
55711729 |
|
30-Sep-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Enforce CAP_MKFIFO on mkfifoat(2), not on mknodat(2). Without this change mkfifoat(2) was not restricted. - Introduce CAP_MKNOD and enforce it on mknodat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks
|
#
5c3e5c7f |
|
25-Sep-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Require CAP_DELETE on directory descriptor for unlinkat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks
|
#
cffcbad2 |
|
25-Sep-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Require CAP_CREATE on directory descriptor for symlinkat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks
|
#
d2e166e6 |
|
25-Sep-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Require CAP_CREATE on directory descriptor for linkat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks
|
#
1159429d |
|
25-Sep-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
O_EXEC flag is not part of the O_ACCMODE mask, check it separately. If O_EXEC is provided don't require CAP_READ/CAP_WRITE, as O_EXEC is mutually exclusive to O_RDONLY/O_WRONLY/O_RDWR. Without this change CAP_FEXECVE capability right is not enforced. Sponsored by: FreeBSD Foundation MFC after: 3 days
|
#
e838f09c |
|
31-Jul-2012 |
John Baldwin <jhb@FreeBSD.org> |
Reorder the managament of advisory locks on open files so that the advisory lock is obtained before the write count is increased during open() and the lock is released after the write count is decreased during close(). The first change closes a race where an open() that will block with O_SHLOCK or O_EXLOCK can increase the write count while it waits. If the process holding the current lock on the file then tries to call exec() on the file it has locked, it can fail with ETXTBUSY even though the advisory lock is preventing other threads from succesfully completeing a writable open(). The second change closes a race where a read-only open() with O_SHLOCK or O_EXLOCK may return successfully while the write count is non-zero due to another descriptor that had the advisory lock and was blocking the open() still being in the process of closing. If the process that completed the open() then attempts to call exec() on the file it locked, it can fail with ETXTBUSY even though the other process that held a write lock has closed the file and released the lock. Reviewed by: kib MFC after: 1 month
|
#
c5c1199c |
|
02-Jul-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Extend the KPI to lock and unlock f_offset member of struct file. It now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is 64bit quantity, always read and written under the mtxpool protection. This fixes apparently easy to trigger race when parallel lseek()s or lseek() and read/write could destroy file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks
|
#
cd4ecf3c |
|
19-Jun-2012 |
John Baldwin <jhb@FreeBSD.org> |
Further refine the implementation of POSIX_FADV_NOREUSE. First, extend the changes in r230782 to better handle the common case of using NOREUSE with sequential reads. A NOREUSE file descriptor will now track the last implicit DONTNEED request it made as a result of a NOREUSE read. If a subsequent NOREUSE read is adjacent to the previous range, it will apply the DONTNEED request to the entire range of both the previous read and the current read. The effect is that each read of a file accessed sequentially will apply the DONTNEED request to the entire range that has been read. This allows NOREUSE to properly handle misaligned reads by flushing each buffer to cache once it has been completely read. Second, apply the same changes made to read(2) by r230782 and this change to writes. This provides much better performance in the sequential write case as it allows writes to still be clustered. It also provides much better performance for misaligned writes. It does mean that NOREUSE will be generally ineffective for non-sequential writes as the current implementation relies on a future NOREUSE write's implicit DONTNEED request to flush the dirty buffer from the current write. MFC after: 2 weeks
|
#
7080f124 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Now that dupfdopen() doesn't depend on finstall() being called earlier, indx will never be -1 on error, as none of dupfdopen(), finstall() and kern_capwrap() modifies it on error, but what is more important none of those functions install and leave file at indx descriptor on error. Leave an assert to prove my words. MFC after: 1 month
|
#
3812dcd3 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Allocate descriptor number in dupfdopen() itself instead of depending on the caller using finstall(). This saves us the filedesc lock/unlock cycle, fhold()/fdrop() cycle and closes a race between finstall() and dupfdopen(). MFC after: 1 month
|
#
7f35af01 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Remove nfp variable that is not really needed. - Update comment. - Style nits. MFC after: 1 month
|
#
c64dd3ba |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove duplicated code. MFC after: 1 month
|
#
81424ab7 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Add missing {. MFC after: 1 month
|
#
85c1550d |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style. MFC after: 1 month
|
#
baf94622 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
There is no need to set td->td_retval[0] to -1 on error. Confirmed by: jhb MFC after: 1 month
|
#
f3cd9805 |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style fixes and simplifications. MFC after: 1 month
|
#
7ac1b61a |
|
08-Jun-2012 |
John Baldwin <jhb@FreeBSD.org> |
Split the second half of vn_open_cred() (after a vnode has been found via a lookup or created via VOP_CREATE()) into a new vn_open_vnode() function and use this function in fhopen() instead of duplicating code from vn_open_cred() directly. Tested by: pho Reviewed by: kib MFC after: 2 weeks
|
#
76dcec5d |
|
24-May-2012 |
Gleb Kurtsou <gleb@FreeBSD.org> |
Add kern_fhstat(), adjust sys_fhstat() to use it. Extend kern_getdirentries() to accept uio segflag and optionally return buffer residue. Sponsored by: Google Summer of Code 2011
|
#
dd952f80 |
|
20-Apr-2012 |
Jaakko Heinonen <jh@FreeBSD.org> |
The value of flags matching VNOVAL can't be supported. Return EOPNOTSUPP from setfflags() in this case. This fixes the return value of chflags(path, -1). Discussed with: bde MFC after: 2 weeks
|
#
39e77c4c |
|
09-Mar-2012 |
Peter Holm <pho@FreeBSD.org> |
Perform the parameter validation before assigning it to a signed int variable. This fixes the problem seen with readdir(3) fuzzing. Submitted by: bde MFC after: 1 week
|
#
ffae9d4d |
|
08-Mar-2012 |
Peter Holm <pho@FreeBSD.org> |
Free up allocated memory used by posix_fadvise(2).
|
#
b47f6241 |
|
08-Mar-2012 |
John Baldwin <jhb@FreeBSD.org> |
Add KTR_VFS traces to track modifications to a vnode's writecount.
|
#
526d0bd5 |
|
20-Feb-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month
|
#
c480f781 |
|
06-Feb-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Current implementations of sync(2) and syncer vnode fsync() VOP uses mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks
|
#
3ab01603 |
|
08-Jan-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Avoid LOR between vfs_busy() lock and covered vnode lock on quotaon(). The vfs_busy() is after covered vnode lock in the global lock order, but since quotaon() does recursive VFS call to open quota file, we usually end up locking covered vnode after mp is busied in sys_quotactl(). Change the interface of VFS_QUOTACTL(), requiring that mp was unbusied by fs code, and do not try to pick up vfs_busy() reference in ufs quotaon, esp. if vfs_busy cannot succeed due to unmount being performed. Reported and tested by: pho MFC after: 1 week
|
#
f427c78b |
|
16-Dec-2011 |
John Baldwin <jhb@FreeBSD.org> |
Fire a kevent if necessary after seeking on a regular file. This fixes a case where a kevent would not fire on a regular file if an application read to EOF and then seeked backwards into the file. Reviewed by: kib MFC after: 2 weeks
|
#
3eb9ab52 |
|
12-Dec-2011 |
Eitan Adler <eadler@FreeBSD.org> |
Document a large number of currently undocumented sysctls. While here fix some style(9) issues and reduce redundancy. PR: kern/155491 PR: kern/155490 PR: kern/155489 Submitted by: Galimov Albert <wtfcrap@mail.ru> Approved by: bde Reviewed by: jhb MFC after: 1 week
|
#
561984be |
|
24-Nov-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix a race between getvnode() dereferencing half-constructed file and dupfdopen(). Reported and tested by: pho MFC after: 3 days
|
#
d3a993d4 |
|
18-Nov-2011 |
Ed Schouten <ed@FreeBSD.org> |
Improve *access*() parameter name consistency. The current code mixes the use of `flags' and `mode'. This is a bit confusing, since the faccessat() function as a `flag' parameter to store the AT_ flag. Make this less confusing by using the same name as used in the POSIX specification -- `amode'.
|
#
7edec621 |
|
14-Nov-2011 |
John Baldwin <jhb@FreeBSD.org> |
- Split out a kern_posix_fadvise() from the posix_fadvise() system call so it can be used by in-kernel consumers. - Make kern_posix_fallocate() public. - Use kern_posix_fadvise() and kern_posix_fallocate() to implement the freebsd32 wrappers for the two system calls.
|
#
936c09ac |
|
03-Nov-2011 |
John Baldwin <jhb@FreeBSD.org> |
Add the posix_fadvise(2) system call. It is somewhat similar to madvise(2) except that it operates on a file descriptor instead of a memory region. It is currently only supported on regular files. Just as with madvise(2), the advice given to posix_fadvise(2) can be divided into two types. The first type provide hints about data access patterns and are used in the file read and write routines to modify the I/O flags passed down to VOP_READ() and VOP_WRITE(). These modes are thus filesystem independent. Note that to ease implementation (and since this API is only advisory anyway), only a single non-normal range is allowed per file descriptor. The second type of hints are used to hint to the OS that data will or will not be used. These hints are implemented via a new VOP_ADVISE(). A default implementation is provided which does nothing for the WILLNEED request and attempts to move any clean pages to the cache page queue for the DONTNEED request. This latter case required two other changes. First, a new V_CLEANONLY flag was added to vinvalbuf(). This requests vinvalbuf() to only flush clean buffers for the vnode from the buffer cache and to not remove any backing pages from the vnode. This is used to ensure clean pages are not wired into the buffer cache before attempting to move them to the cache page queue. The second change adds a new vm_object_page_cache() method. This method is somewhat similar to vm_object_page_remove() except that instead of freeing each page in the specified range, it attempts to move clean pages to the cache queue if possible. To preserve the ABI of struct file, the f_cdevpriv pointer is now reused in a union to point to the currently active advice region if one is present for regular files. Reviewed by: jilles, kib, arch@ Approved by: re (kib) MFC after: 1 month
|
#
8451d0dd |
|
16-Sep-2011 |
Kip Macy <kmacy@FreeBSD.org> |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
|
#
9c00bb91 |
|
16-Aug-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add the fo_chown and fo_chmod methods to struct fileops and use them to implement fchown(2) and fchmod(2) support for several file types that previously lacked it. Add MAC entries for chown/chmod done on posix shared memory and (old) in-kernel posix semaphores. Based on the submission by: glebius Reviewed by: rwatson Approved by: re (bz)
|
#
fd9a5f73 |
|
13-Aug-2011 |
Robert Watson <rwatson@FreeBSD.org> |
When falloc() was broken into separate falloc_noinstall() and finstall(), a bug was introduced in kern_openat() such that the error from the vnode open operation was overwritten before it was passed as an argument to dupfdopen(). This broke operations on /dev/{stdin,stdout,stderr}. Fix by preserving the original error number across finstall() so that it is still available. Approved by: re (kib) Reported by: cognet
|
#
69d377fe |
|
13-Aug-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Allow Capsicum capabilities to delegate constrained access to file system subtrees to sandboxed processes. - Use of absolute paths and '..' are limited in capability mode. - Use of absolute paths and '..' are limited when looking up relative to a capability. - When a name lookup is performed, identify what operation is to be performed (such as CAP_MKDIR) as well as check for CAP_LOOKUP. With these constraints, openat() and friends are now safe in capability mode, and can then be used by code such as the capability-mode runtime linker. Approved by: re (bz), mentor (rwatson) Sponsored by: Google Inc
|
#
d6d2cfa2 |
|
11-Aug-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Only call fdclose() on successfully-opened FDs. Since kern_openat() now uses falloc_noinstall() and finstall() separately, there are cases where we could get to cleanup code without ever creating a file descriptor. In those cases, we should not call fdclose() on FD -1. Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc
|
#
a9d2f8d8 |
|
10-Aug-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0: Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op. Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions. In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit. Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent. Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
#
694a586a |
|
21-May-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed. Reviewed by: kib
|
#
1ce4508f |
|
19-Apr-2011 |
Matthew D Fleming <mdf@FreeBSD.org> |
Allow VOP_ALLOCATE to be iterative, and have kern_posix_fallocate(9) drive looping and potentially yielding. Requested by: kib
|
#
5d253e41 |
|
18-Apr-2011 |
Matthew D Fleming <mdf@FreeBSD.org> |
Fix a copy/paste whitespace error.
|
#
d91f88f7 |
|
18-Apr-2011 |
Matthew D Fleming <mdf@FreeBSD.org> |
Add the posix_fallocate(2) syscall. The default implementation in vop_stdallocate() is filesystem agnostic and will run as slow as a read/write loop in userspace; however, it serves to correctly implement the functionality for filesystems that do not implement a VOP_ALLOCATE. Note that __FreeBSD_version was already bumped today to 900036 for any ports which would like to use this function. Also reserve space in the syscall table for posix_fadvise(2). Reviewed by: -arch (previous version)
|
#
1fe80828 |
|
01-Apr-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
After the r219999 is merged to stable/8, rename fallocf(9) to falloc(9) and remove the falloc() version that lacks flag argument. This is done to reduce the KPI bloat. Requested by: jhb X-MFC-note: do not
|
#
7332c129 |
|
01-Apr-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add support for executing the FreeBSD 1/i386 a.out binaries on amd64. In particular: - implement compat shims for old stat(2) variants and ogetdirentries(2); - implement delivery of signals with ancient stack frame layout and corresponding sigreturn(2); - implement old getpagesize(2); - provide a user-mode trampoline and LDT call gate for lcall $7,$0; - port a.out image activator and connect it to the build as a module on amd64. The changes are hidden under COMPAT_43. MFC after: 1 month
|
#
246d35ec |
|
25-Mar-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add O_CLOEXEC flag to open(2) and fhopen(2). The new function fallocf(9), that is renamed falloc(9) with added flag argument, is provided to facilitate the merge to stable branch. Reviewed by: jhb MFC after: 1 week
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
79856499 |
|
22-Aug-2010 |
Rui Paulo <rpaulo@FreeBSD.org> |
Add an extra comment to the SDT probes definition. This allows us to get use '-' in probe names, matching the probe names in Solaris.[1] Add userland SDT probes definitions to sys/sdt.h. Sponsored by: The FreeBSD Foundation Discussed with: rwaston [1]
|
#
aa81ae08 |
|
06-Jul-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
In revoke(), verify that VCHR vnode indeed belongs to devfs. Found and tested by: pho MFC after: 1 week
|
#
f4629420 |
|
01-Jun-2010 |
Robert Watson <rwatson@FreeBSD.org> |
Merge r204430 from head to stable/8: Remove stale comment about socket buffer accounting from access(2) code. It is the case, however, that the uidinfo of the temporary credential set up for access(2) is not properly updated when its effective uid is changed. Approved by: re (bz)
|
#
3b23a422 |
|
27-Apr-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r206547: Handle a case in kern_openat() when vn_open() change file type from DTYPE_VNODE.
|
#
bbf79892 |
|
16-Apr-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r206546: Remove XXX comment. Add another comment, describing why f_vnode assignment is useful.
|
#
d5ff2735 |
|
13-Apr-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Handle a case in kern_openat() when vn_open() change file type from DTYPE_VNODE. Only acquire locks for O_EXLOCK/O_SHLOCK if file type is still vnode, since we allow for fcntl(2) to process with advisory locks for DTYPE_VNODE only. Another reason is that all fo_close() routines need to check and release locks otherwise. For O_TRUNC, call fo_truncate() instead of truncating the vnode. Discussed with: rwatson MFC after: 2 week
|
#
5478ba73 |
|
13-Apr-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove XXX comment. Add another comment, describing why f_vnode assignment is useful. MFC after: 3 days
|
#
510ea843 |
|
28-Mar-2010 |
Ed Schouten <ed@FreeBSD.org> |
Rename st_*timespec fields to st_*tim for POSIX 2008 compliance. A nice thing about POSIX 2008 is that it finally standardizes a way to obtain file access/modification/change times in sub-second precision, namely using struct timespec, which we already have for a very long time. Unfortunately POSIX uses different names. This commit adds compatibility macros, so existing code should still build properly. Also change all source code in the kernel to work without any of the compatibility macros. This makes it all a less ambiguous. I am also renaming st_birthtime to st_birthtim, even though it was a local extension anyway. It seems Cygwin also has a st_birthtim.
|
#
bf876fcd |
|
27-Mar-2010 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
MFC r200273: Don't add VAPPEND if the file is not being opened for writing. Note that this only affects cases where open(2) is being used improperly - i.e. when the user specifies O_APPEND without O_WRONLY or O_RDWR. Reviewed by: rwatson
|
#
0fef797f |
|
21-Mar-2010 |
Ed Schouten <ed@FreeBSD.org> |
Actually make O_DIRECTORY work. According to POSIX open() must return ENOTDIR when the path name does not refer to a path name. Change vn_open() to respect this flag. This also simplifies the Linuxolator a bit.
|
#
9e0cda03 |
|
11-Mar-2010 |
John Baldwin <jhb@FreeBSD.org> |
Fix a comment nit. Submitted by: Alexander Best
|
#
7a547c99 |
|
10-Mar-2010 |
John Baldwin <jhb@FreeBSD.org> |
MFC 204638: Allow lseek(SEEK_END) to work on disk devices by using the DIOCGMEDIASIZE to determine the media size.
|
#
ab99e589 |
|
03-Mar-2010 |
John Baldwin <jhb@FreeBSD.org> |
Allow lseek(SEEK_END) to work on disk devices by using the DIOCGMEDIASIZE to determine the media size. Submitted by: nox MFC after: 1 week
|
#
6c77113c |
|
27-Feb-2010 |
Robert Watson <rwatson@FreeBSD.org> |
Remove stale comment about socket buffer accounting from access(2) code. It is the case, however, that the uidinfo of the temporary credential set up for access(2) is not properly updated when its effective uid is changed. MFC after: 3 days
|
#
e268f54c |
|
11-Jan-2010 |
Kirk McKusick <mckusick@FreeBSD.org> |
Background: When renaming a directory it passes through several intermediate states. First its new name will be created causing it to have two names (from possibly different parents). Next, if it has different parents, its value of ".." will be changed from pointing to the old parent to pointing to the new parent. Concurrently, its old name will be removed bringing it back into a consistent state. When fsck encounters an extra name for a directory, it offers to remove the "extraneous hard link"; when it finds that the names have been changed but the update to ".." has not happened, it offers to rewrite ".." to point at the correct parent. Both of these changes were considered unexpected so would cause fsck in preen mode or fsck in background mode to fail with the need to run fsck manually to fix these problems. Fsck running in preen mode or background mode now corrects these expected inconsistencies that arise during directory rename. The functionality added with this update is used by fsck running in background mode to make these fixes. Solution: This update adds three new fsck sysctl commands to support background fsck in correcting expected inconsistencies that arise from incomplete directory rename operations. They are: setcwd(dirinode) - set the current directory to dirinode in the filesystem associated with the snapshot. setdotdot(oldvalue, newvalue) - Verify that the inode number for ".." in the current directory is oldvalue then change it to newvalue. unlink(nameptr, oldvalue) - Verify that the inode number associated with nameptr in the current directory is oldvalue then unlink it. As with all other fsck sysctls, these new ones may only be used by processes with appropriate priviledge. Reported by: jeff Security issues: rwatson
|
#
9d7031a6 |
|
08-Dec-2009 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Don't add VAPPEND if the file is not being opened for writing. Note that this only affects cases where open(2) is being used improperly - i.e. when the user specifies O_APPEND without O_WRONLY or O_RDWR. Reviewed by: rwatson
|
#
93566d2a |
|
09-Sep-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r196887: In fhopen, vfs_ref() the mount point while vnode is unlocked, to prevent vn_start_write(NULL, &mp) from operating on potentially freed or reused struct mount *. Remove unmatched vfs_rel() in cleanup. Approved by: re (kensmith)
|
#
db17314e |
|
06-Sep-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
In fhopen, vfs_ref() the mount point while vnode is unlocked, to prevent vn_start_write(NULL, &mp) from operating on potentially freed or reused struct mount *. Remove unmatched vfs_rel() in cleanup. Noted and reviewed by: tegge Tested by: pho MFC after: 3 days
|
#
e179d138 |
|
31-Aug-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r196560: Honor the vfs.timestamp_precision sysctl settings for utimes(path, NULL) and similar calls. Approved by: re (rwatson)
|
#
4f4946d3 |
|
26-Aug-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Honor the vfs.timestamp_precision sysctl settings for utimes(path, NULL) and similar calls. Obtained from: Petr Salinger, Debian GNU/kFreeBSD, Debian bug #489894 MFC after: 3 days
|
#
87eca70e |
|
31-Jul-2009 |
John Baldwin <jhb@FreeBSD.org> |
Fix some LORs between vnode locks and filedescriptor table locks. - Don't grab the filedesc lock just to read fd_cmask. - Drop vnode locks earlier when mounting the root filesystem and before sanitizing stdin/out/err file descriptors during execve(). Submitted by: kib Approved by: re (rwatson) MFC after: 1 week
|
#
b146fc1b |
|
28-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Rework vnode argument auditing to follow the same structure, in order to avoid exposing ARG_ macros/flag values outside of the audit code in order to name which one of two possible vnodes will be audited for a system call. Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 month
|
#
c3889811 |
|
08-Jul-2009 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
There is an optimization in chmod(1), that makes it not to call chmod(2) if the new file mode is the same as it was before; however, this optimization must be disabled for filesystems that support NFSv4 ACLs. Chmod uses pathconf(2) to determine whether this is the case - however, pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't. This change adds lpathconf(3) to make it possible to solve that problem in a clean way. Reviewed by: rwatson (earlier version) Approved by: re (kib)
|
#
03f7b004 |
|
01-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
For access(2) and eaccess(2), audit the requested access mode. Approved by: re (audit argument blanket) MFC after: 3 days
|
#
422d7866 |
|
01-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Audit the file descriptor number passed to lseek(2). Approved by: re (kib) MFC after: 3 days
|
#
c5957d6b |
|
01-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Fix link(2) auditing: use the second audit record path for the new object name. Approved by: re (kib) MFC after: 3 days
|
#
14961ba7 |
|
27-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Replace AUDIT_ARG() with variable argument macros with a set more more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week
|
#
aed2a061 |
|
13-Jun-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Remove the static from int hardlink_check_uid. There is an external use in the opensolaris code. I am not sure how this ever worked but I have seen two reports of: link_elf: symbol hardlink_check_uid undefined lately. Reported by: Scott Ullrich (sullrich gmail.com), pfsense Reported by: Mister Olli (mister.olli googlemail.com)
|
#
27bfb741 |
|
08-Jun-2009 |
Paul Saab <ps@FreeBSD.org> |
Simply shared vnode locking and extend it to also include fsync. Also, in vop_write, no longer assert for exclusive locks on the vnode. Reviewed by: jhb, kmacy, jeffr
|
#
bcf11e8d |
|
05-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
|
#
0304c731 |
|
27-May-2009 |
Jamie Gritton <jamie@FreeBSD.org> |
Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
|
#
bf422e5f |
|
13-May-2009 |
Jeff Roberson <jeff@FreeBSD.org> |
- Implement a lockless file descriptor lookup algorithm in fget_unlocked(). - Save old file descriptor tables created on expansion until the entire descriptor table is freed so that pointers may be followed without regard for expanders. - Mark the file zone as NOFREE so we may attempt to reference potentially freed files. - Convert several fget_locked() users to fget_unlocked(). This requires us to manage reference counts explicitly but reduces locking overhead in the common case.
|
#
3b616fae |
|
11-May-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Prevent overflow of uio_resid. Noted by: jhb MFC after: 3 days
|
#
dfd233ed |
|
11-May-2009 |
Attilio Rao <attilio@FreeBSD.org> |
Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.
|
#
885868cd |
|
10-Apr-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Remove VOP_LEASE and supporting functions. This hasn't been used since the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces. Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd. Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon
|
#
0eee862a |
|
20-Feb-2009 |
Ed Schouten <ed@FreeBSD.org> |
Don't make Linux stat() open character devices to resolve its name. The existing code calls kern_open() to resolve the vnode of a pathname right after a stat(). This is not correct, because it causes random character devices to be opened in /dev. This means ls'ing a tape streamer will cause it to rewind, for example. Changes I have made: - Add kern_statat_vnhook() to allow binary emulators to `post-process' struct stat, using the proper vnode. - Remove unneeded printf's from stat() and statfs(). - Make the Linuxolator use kern_statat_vnhook(), replacing translate_path_major_minor_at(). - Let translate_fd_major_minor() use vp->v_rdev instead of vp->v_un.vu_cdev. Result: crw-rw-rw- 1 root root 0, 14 Feb 20 13:54 /dev/ptmx crw--w---- 1 root adm 136, 0 Feb 20 14:03 /dev/pts/0 crw--w---- 1 root adm 136, 1 Feb 20 14:02 /dev/pts/1 crw--w---- 1 ed tty 136, 2 Feb 20 14:03 /dev/pts/2 Before this commit, ptmx also had a major number of 136, because it silently allocated and deallocated a pseudo-terminal. Device nodes that cannot be opened now have proper major/minor-numbers. Reviewed by: kib, netchild, rdivacky (thanks!)
|
#
ea77ff0a |
|
13-Feb-2009 |
John Baldwin <jhb@FreeBSD.org> |
Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month
|
#
27dd8057 |
|
05-Feb-2009 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
In some situations, mnt_lockref could go negative due to vfs_unbusy() being called without calling vfs_busy() first. This made umount(8) hang waiting for mnt_lockref to become zero, which would never happen. Reviewed by: kib Approved by: rwatson (mentor) Reported by: pho Found with: stress2 Sponsored by: FreeBSD Foundation
|
#
efc65197 |
|
23-Jan-2009 |
John Baldwin <jhb@FreeBSD.org> |
Use shared vnode locks for fchdir(). Submitted by: ups
|
#
ab62a2d0 |
|
27-Dec-2008 |
Peter Holm <pho@FreeBSD.org> |
Prevent overflow of uio_resid. Approved by: kib
|
#
548066ea |
|
17-Dec-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
The quotactl, statfs and fstatfs syscall implementations may dereference NULL pointer to struct mount if the looked up vnode is reclaimed. Also, these syscalls only mnt_ref() the mp, still allowing it to be unmounted; only struct mount memory is kept from being reused. Lock the vnode when doing name lookup, then reference its mount point, unlock the vnode and vfs_busy the mountpoint. This sequence shall take care of both races. Reported and tested by: pho Discussed with: attilio MFC after: 1 month
|
#
61791644 |
|
29-Nov-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
In the nfsrv_fhtovp(), after the vfs_getvfs() function found the pointer to the fs, but before a vnode on the fs is locked, unmount may free fs structures, causing access to destroyed data and freed memory. Introduce a vfs_busymp() function that looks up and busies found fs while mountlist_mtx is held. Use it in nfsrv_fhtovp() and in the implementation of the handle syscalls. Two other uses of the vfs_getvfs() in the vfs_subr.c, namely in sysctl_vfs_ctl and vfs_getnewfsid seems to be ok. In particular, sysctl_vfs_ctl is protected by Giant by being a non-sleeping sysctl handler, that prevents Giant-locked unmount code to interfere with it. Noted by: tegge Reviewed by: dfr Tested by: pho MFC after: 1 month
|
#
e506f34b |
|
05-Nov-2008 |
Craig Rodrigues <rodrigc@FreeBSD.org> |
Merge latest DTrace changes from Perforce. Approved by: jb
|
#
927edcc9 |
|
04-Nov-2008 |
John Baldwin <jhb@FreeBSD.org> |
Use shared vnode locks for auditing vnode arguments as auditing only does a VOP_GETATTR() which does not require an exclusive lock. Reviewed by: csjp, rwatson
|
#
21fc02d2 |
|
03-Nov-2008 |
John Baldwin <jhb@FreeBSD.org> |
Use shared vnode locks instead of exclusive vnode locks for the access(), chdir(), chroot(), eaccess(), fpathconf(), fstat(), fstatfs(), lseek() (when figuring out the current size of the file in the SEEK_END case), pathconf(), readlink(), and statfs() system calls. Submitted by: ups (mostly) Tested by: pho MFC after: 1 month
|
#
83b3bdbc |
|
02-Nov-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Improve VFS locking: - Implement real draining for vfs consumers by not relying on the mnt_lock and using instead a refcount in order to keep track of lock requesters. - Due to the change above, remove the mnt_lock lockmgr because it is now useless. - Due to the change above, vfs_busy() is no more linked to a lockmgr. Change so its KPI by removing the interlock argument and defining 2 new flags for it: MBF_NOWAIT which basically replaces the LK_NOWAIT of the old version (which was unlinked from the lockmgr alredy) and MBF_MNTLSTLOCK which provides the ability to drop the mountlist_mtx once the mnt interlock is held (ability still desired by most consumers). - The stub used into vfs_mount_destroy(), that allows to override the mnt_ref if running for more than 3 seconds, make it totally useless. Remove it as it was thought to work into older versions. If a problem of "refcount held never going away" should appear, we will need to fix properly instead than trust on such hackish solution. - Fix a bug where returning (with an error) from dounmount() was still leaving the MNTK_MWAIT flag on even if it the waiters were actually woken up. Just a place in vfs_mount_destroy() is left because it is going to recycle the structure in any case, so it doesn't matter. - Remove the markercnt refcount as it is useless. This patch modifies VFS ABI and breaks KPI for vfs_busy() so manpages and __FreeBSD_version will be modified accordingly. Discussed with: kib Tested by: pho
|
#
15bc6b2b |
|
28-Oct-2008 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)
|
#
62397526 |
|
23-Oct-2008 |
John Baldwin <jhb@FreeBSD.org> |
Whitespace fix.
|
#
1ede983c |
|
23-Oct-2008 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months
|
#
63f8fe9e |
|
22-Oct-2008 |
John Baldwin <jhb@FreeBSD.org> |
Split the copyout of *base at the end of getdirentries() out leaving the rest in kern_getdirentries(). Use kern_getdirentries() to implement freebsd32_getdirentries(). This fixes a bug where calls to getdirentries() in 32-bit binaries would trash the 4 bytes after the 'long base' in userland. Submitted by: ups MFC after: 1 week
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
2765482b |
|
01-Sep-2008 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
When setting error to EINVAL in 'fvp == tdvp' case, jump to out label, because if not, the error will be later overwritten by mac_vnode_check_rename_to() call. Reviewed by: rwatson
|
#
59d49325 |
|
31-Aug-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions. Manpages are updated accordingly. Tested by: Diego Sardina <siarodx at gmail dot com>
|
#
0359a12e |
|
28-Aug-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
3319d712 |
|
22-Jun-2008 |
Robert Watson <rwatson@FreeBSD.org> |
If S_IFIFO is passed to mknod(2), invoke kern_mkfifoat(9) to create a FIFO, as required by SUSv3. No specific privilege check is performed in this case, as FIFOs may be created by unprivileged processes (subject to the normal file system name space restrictions that may be in place). Unlike the Apple implementation, we reject requests to create a FIFO using mknod(2) if there is a non-zero dev argument to the system call, which is permitted by the Open Group specification ("... undefined ..."). We might want to revise this if we find it causes compatibility problems for applications in practice. PR: kern/74242, kern/68459 Obtained from: Apple, Inc. MFC after: 3 weeks
|
#
8a372438 |
|
06-Apr-2008 |
Don Lewis <truckman@FreeBSD.org> |
vfs_syscalls.c 1.452 mistakenly swapped the behavior of chown() and lchown().
|
#
e4193f25 |
|
30-Mar-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement the openat(2), faccessat(2), fchmodat(2), fchownat(2), fstatat(2), futimesat(2), linkat(2), mkdirat(2), mkfifoat(2), mknodat(2), readlinkat(2), renameat(2), symlinkat(2) syscalls. Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho
|
#
e314f69f |
|
31-Mar-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Add the support for the O_EXEC open(2) mode, as specified by the POSIX Extended API Set Part 2 extension specification. Reviewed by: rwatson, rdivacky Tested by: pho
|
#
60e15db9 |
|
22-Feb-2008 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
This patch adds a new ktrace(2) record type, KTR_STRUCT, whose payload consists of the null-terminated name and the contents of any structure you wish to record. A new ktrstruct() function constructs and emits a KTR_STRUCT record. It is accompanied by convenience macros for struct stat and struct sockaddr. In kdump(1), KTR_STRUCT records are handled by a dispatcher function that runs stringent sanity checks on its contents before handing it over to individual decoding funtions for each type of structure. Currently supported structures are struct stat and struct sockaddr for the AF_INET, AF_INET6 and AF_UNIX families; support for AF_APPLETALK and AF_IPX is present but disabled, as I am unable to test it properly. Since 's' was already taken, the letter 't' is used by ktrace(1) to enable KTR_STRUCT trace points, and in kdump(1) to enable their decoding. Derived from patches by Andrew Li <andrew2.li@citi.com>. PR: kern/117836 MFC after: 3 weeks
|
#
5f56182b |
|
12-Feb-2008 |
Ruslan Ermilov <ru@FreeBSD.org> |
Change readlink(2)'s return type and type of the last argument to match POSIX. Prodded by: Alexey Lyashkov
|
#
22db15c0 |
|
13-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
cb05b60a |
|
09-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
e4650294 |
|
07-Jan-2008 |
John Baldwin <jhb@FreeBSD.org> |
Make ftruncate a 'struct file' operation rather than a vnode operation. This makes it possible to support ftruncate() on non-vnode file types in the future. - 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on a given file descriptor. - ftruncate() moves to kern/sys_generic.c and now just fetches a file object and invokes fo_truncate(). - The vnode-specific portions of ftruncate() move to vn_truncate() in vfs_vnops.c which implements fo_truncate() for vnode file types. - Non-vnode file types return EINVAL in their fo_truncate() method. Submitted by: rwatson
|
#
397c19d1 |
|
29-Dec-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove explicit locking of struct file. - Introduce a finit() which is used to initailize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho
|
#
30d239bc |
|
24-Oct-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
|
#
45e0f3d6 |
|
09-Sep-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Rename mac_check_vnode_delete() MAC Framework and MAC Policy entry point to mac_check_vnode_unlink(), reflecting UNIX naming conventions. This is the first of several commits to synchronize the MAC Framework in FreeBSD 7.0 with the MAC Framework as it will appear in Mac OS X Leopard. Reveiwed by: csjp, Samy Bahra <sbahra at gwu dot edu> Submitted by: Jacques Vidrine <nectar at apple dot com> Obtained from: Apple Computer, Inc. Sponsored by: SPARTA, SPAWAR Approved by: re (bmah)
|
#
cc479dda |
|
28-Aug-2007 |
John Baldwin <jhb@FreeBSD.org> |
Rework the routines to convert a 5.x+ statfs structure (with fixed-size 64-bit counters) to a 4.x statfs structure (with long-sized counters). - For block counters, we scale up the block size sufficiently large so that the resulting block counts fit into a the long-sized (long for the ABI, so 32-bit in freebsd32) counters. In 4.x the NFS client's statfs VOP did this already. This can lie about the block size to 4.x binaries, but it presents a more accurate picture of the ratios of free and available space. - For non-block counters, fix the freebsd32 stats converter to cap the values at INT32_MAX rather than losing the upper 32-bits to match the behavior of the 4.x statfs conversion routine in vfs_syscalls.c Approved by: re (kensmith)
|
#
c2815ad5 |
|
04-Jul-2007 |
Peter Wemm <peter@FreeBSD.org> |
Add freebsd6_ wrappers for mmap/lseek/pread/pwrite/truncate/ftruncate Approved by: re (kensmith)
|
#
32f9753c |
|
11-Jun-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project
|
#
9e223287 |
|
31-May-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)
|
#
5c76452f |
|
04-May-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Mark the filedescriptor table entries with VOP_OPEN being performed for them as UF_OPENING. Disable closing of that entries. This should fix the crashes caused by devfs_open() (and fifo_open()) dereferencing struct file * by index, while the filedescriptor is closed by parallel thread. Idea by: tegge Reviewed by: tegge (previous version of patch) Tested by: Peter Holm Approved by: re (kensmith) MFC after: 3 weeks
|
#
f6521d1c |
|
05-Apr-2007 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Implement SEEK_DATA and SEEK_HOLE extensions to lseek(2) as found in OpenSolaris. For more information please refer to: http://blogs.sun.com/bonwick/entry/seek_hole_and_seek_data
|
#
5e3f7694 |
|
04-Apr-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff
|
#
ebb3c22c |
|
02-Apr-2007 |
John Baldwin <jhb@FreeBSD.org> |
Don't go to a whole lot of extra work to handle the race where the new file descriptor is closed out from under us in kern_open(). This race is already handled and the file will be closed when kern_open() does an fdrop just before returning.
|
#
ecd82461 |
|
21-Mar-2007 |
John Baldwin <jhb@FreeBSD.org> |
If vn_open() fails during kern_open(), don't fdrop() the new file object until after the call to fdclose(). This closes an obscure race that could result in the later call to fdclose() actually closing a different file descriptor if another thread close()'s the file descriptor being opened before fdrop() is called, so the fdrop() in kern_open() frees the file object, then the second thread (or a third) creates a new file descriptor which reuses both the same index and the same file pointer thus tricking fdclose() in the first thread into thinking that the original file was still open. MFC after: 1 week
|
#
71d49316 |
|
14-Mar-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Busy filesystem around call of VFS_QUOTACTL() vfs op. Tested by: Peter Holm Reviewed by: tegge Approved by: re (kensmith)
|
#
873fbcd7 |
|
05-Mar-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Further system call comment cleanup: - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde) - Remove extra blank lines in some cases. - Add extra blank lines in some cases. - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name. - Add punctuation. - Re-wrap some comments.
|
#
9b2f1a07 |
|
19-Feb-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove union_dircheckp hook, it is not needed by new unionfs code anymore. As consequence, getdirentries() no longer needs to drop/reacquire directory vnode lock, that would allow it to be reclaimed in between. Reported and tested by: Peter Holm Approved by: rodrigc (unionfs) MFC after: 1 week
|
#
10bcafe9 |
|
15-Feb-2007 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs
|
#
168d0553 |
|
22-Dec-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Following a repo-copy of vfs_syscalls.c to vfs_extattr.c, remove non-extattr functions from vfs_extattr.c, and extattr functions from vfs_syscalls.c. Change copyright/license on vfs_extattr.c to my copyright/license on the extended attribute implementation (from extattr.h). Clean up includes a bit. Obtained from: TrustedBSD Project
|
#
acd3428b |
|
06-Nov-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
#
9a969e62 |
|
26-Oct-2006 |
Konstantin Belousov <kib@FreeBSD.org> |
The attempt to rename "." with MAC framework compiled in would cause attempt to twice unlock the vnode. Check that ni_vp and ni_dvp are different before doing second unlock. Reviewed by: rwatson Approved by: pjd (mentor) MFC after: 1 week
|
#
aed55708 |
|
22-Oct-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA
|
#
a1e363f2 |
|
25-Sep-2006 |
Tor Egge <tegge@FreeBSD.org> |
Add mnt_noasync counter to better handle interleaved calls to nmount(), sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag which is set only when MNT_ASYNC is set and mnt_noasync is zero, and check that flag instead of MNT_ASYNC before initiating async io.
|
#
5da56ddb |
|
25-Sep-2006 |
Tor Egge <tegge@FreeBSD.org> |
Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().
|
#
783deec1 |
|
20-Sep-2006 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
There is no need to set 'sp' to NULL anymore.
|
#
4e59868e |
|
19-Sep-2006 |
Tor Egge <tegge@FreeBSD.org> |
Copy stat information from mount structure before it can change identity.
|
#
5702e096 |
|
17-Sep-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Declare security and security.bsd sysctl hierarchies in sysctl.h along with other commonly used sysctl name spaces, rather than declaring them all over the place. MFC after: 1 month Sponsored by: nCircle Network Security, Inc.
|
#
9802d04c |
|
02-Aug-2006 |
John Baldwin <jhb@FreeBSD.org> |
Fix some bugs in the previous revision (1.419). Don't perform extra vfs_rel() on the mountpoint if the MAC checks fail in kern_statfs() and kern_fstatfs(). Similarly, don't perform an extra vfs_rel() if we get a doomed vnode in kern_fstatfs(), and handle the case of mp being NULL (for some doomed vnodes) by conditionalizing the vfs_rel() in kern_fstatfs() on mp != NULL. CID: 1517 Found by: Coverity Prevent (tm) (kern_fstatfs()) Pointy hat to: jhb
|
#
ea175645 |
|
27-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
Hold the reference on the mountpoint slightly longer in kern_statfs() and kern_fstatfs() so that it is still held when prison_enforce_statfs() is called (since that function likes to poke and prod at the mountpoint structure). MFC after: 3 days
|
#
2f198e89 |
|
19-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
Call change_dir() instead of duplicating the code in fchdir().
|
#
be5747d5 |
|
11-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.
|
#
65ee602e |
|
06-Jul-2006 |
Wayne Salamon <wsalamon@FreeBSD.org> |
Audit the remaining parameters to the extattr system calls. Generate the audit records for those calls. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)
|
#
6e79e6f8 |
|
05-Jun-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Audit command, uid arguments for quotactl(). Audit the mode argument to mkfifo(). Audit the target path passed to symlink(). Submitted by: wsalamon Obtained from: TrustedBSD Project
|
#
3bbd6d8a |
|
30-Mar-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs(). Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.
|
#
861dab08 |
|
28-Mar-2006 |
John Baldwin <jhb@FreeBSD.org> |
Change vn_open() to honor the MPSAFE flag in the passed in nameidata object and use that instead of testing fdidx against -1 to determine if it should release Giant if Giant was locked due to the requested file residing on a non-MPSAFE VFS. Discussed with: jeff
|
#
77c79550 |
|
21-Mar-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove explicit calls to lock and unlock Giant and replace them with VFS_LOCK_GIANT/VFS_UNLOCK_GIANT calls. This completely removes Giant acquisition in the syscall path for ffs. Bug fix to kern_fhstatfs from: Todd Miller <Todd.Miller@sparta.com> Sponsored by: Isilon Systems, Inc.
|
#
6308f39d |
|
03-Mar-2006 |
Paul Saab <ps@FreeBSD.org> |
use strlcpy in cvtstatfs and copy_statfs instead of bcopy to ensure the copied strings are properly terminated. bzero the statfs32 struct in copy_statfs.
|
#
6815739e |
|
03-Mar-2006 |
Paul Saab <ps@FreeBSD.org> |
Don't truncate f_mntfromname & f_mntonname to 16 characters when translating statfs into ostatfs. This allows 4.x binaries making statfs calls to work on 6.x.
|
#
8febcfb9 |
|
22-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Use vfs_ref/rel to protect a mountpoint from going away while VFS_STATFS is being called. Be sure to grab the ref before we unlock the vnode to prevent the mount from disappearing. Tested by: kris
|
#
bc5504b9 |
|
22-Feb-2006 |
Wayne Salamon <wsalamon@FreeBSD.org> |
Add pathname and/or vnode argument auditing for the following system calls: quotactl, statfs, fstatfs, fchdir, chdir, chroot, open, mknod, mkfifo, link, symlink, undelete, unlink, access, eaccess, stat, lstat, pathconf, readlink, chflags, lchflags, fchflags, chmod, lchmod, fchmod, chown, lchown, fchown, utimes, lutimes, futimes, truncate, ftruncate, fsync, rename, mkdir, rmdir, getdirentries, revoke, lgetfh, getfh, extattrctl, extattr_set_file, extattr_set_link, extattr_get_file, extattr_get_link, extattr_delete_file, extattr_delete_link, extattr_list_file, extattr_list_link. In many cases the pathname and vnode auditing is done within namei lookup instead of directly in the system call. Audit the remaining arguments to these system calls: fstatfs, fchdir, open, mknod, chflags, lchflags, fchflags, chmod, lchmod, fchmod, chown, lchown, fchown, futimes, ftruncate, fsync, mkdir, getdirentries.
|
#
c5dcb840 |
|
22-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Revert r1.406 until a solution can be found that doesn't break nfs. The statfs handler in nfs will lock vnodes which may lead to deadlock or recursion. Found by: kris Pointy hat to: me
|
#
05b6a20a |
|
21-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Hold the vnode used in the statfs related functions until we're done with the VFS_STATFS call to prevent the mount from disappearing while we're stating. - Convert these routines to use MPSAFE namei semantics. MFC After: 1 week
|
#
809f984b |
|
06-Feb-2006 |
John Baldwin <jhb@FreeBSD.org> |
Add a kern_eaccess() function and use it to implement xenix_eaccess() rather than kern_access(). Suggested by: rwatson
|
#
2f0bca55 |
|
06-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Don't check v_mount for NULL to determine if a vnode has been recycled. Use the more appropriate VI_DOOMED flag instead. Sponsored by: Isilon Systems, Inc. MFC After: 1 week
|
#
59428b0b |
|
03-Feb-2006 |
Robert Watson <rwatson@FreeBSD.org> |
In fchdir(), Giant must be separately acquired and dropped if the old vnode is from a file system that is not MPSAFE, as vrele() expects Giant to be held when it is called on a non-MPSAFE vnode. Spotted by: kris Tested by: glebius
|
#
0ac72424 |
|
01-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- chroot and chdir need to lock giant as appropriate for the outgoing vp as well as the new vp. Sponsored by: Isilon Systems, Inc. MFC After: 3 days
|
#
89b0e109 |
|
31-Jan-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Reorder calls to vrele() after calls to vput() when the vrele is a directory. vrele() may lock the passed vnode, which in these cases would give an invalid lock order of child -> parent. These situations are deadlock prone although do not typically deadlock because the vrele is typically not releasing the last reference to the vnode. Users of vrele must consider it as a call to vn_lock() and order it appropriately. MFC After: 1 week Sponsored by: Isilon Systems, Inc. Tested by: kkenn
|
#
1dd5fc0f |
|
22-Jan-2006 |
Don Lewis <truckman@FreeBSD.org> |
Tweak previous vfs_lookup.c commit to return an EINVAL error from lookup() instead of EPERM when a DELETE or RENAME operation is attempted on "..". In kern_unlink(), remap EINVAL errors returned from namei() to EPERM to match existing (and POSIX required) behaviour. Discussed with: bde MFC after: 3 days
|
#
c3d78136 |
|
04-Jan-2006 |
Diomidis Spinellis <dds@FreeBSD.org> |
Fix style bug. Prompted by: bde
|
#
f8ccc6ce |
|
03-Jan-2006 |
Diomidis Spinellis <dds@FreeBSD.org> |
Replace tv_usec normalization with the return of EINVAL. This addresses two objections to the previous behavior, and unbreaks the alpha tinderbox build. TODO: update the utimes(2) man page.
|
#
51339e85 |
|
03-Jan-2006 |
Diomidis Spinellis <dds@FreeBSD.org> |
Normalize the tv_usec part of the utimes(2) arguments to ensure that a file's atime and mtime are only set to correct fractional second values (0-999999000ns with the current interface). Prior to this change users could create files with values outside that range. Moreover, on 32-bit machines tv_usec offsets larger than 4.3s would result in an unnormalized AND wrong timestamp value, due to overflow. MFC after: 1 week
|
#
c505fe7a |
|
19-Dec-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Reduce Giant scope a bit, as fdrop() is believed to be MPSAFE. The purpose of this change is consistency (not performance improvement:)), as it was hard to tell if fdrop() is MPSAFE or not when I saw it sometimes under the Giant and sometimes without it. Glanced at by: ssouhlal, kan
|
#
c47a4d1c |
|
24-Sep-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Implement new world order in VFS locking for extended attributes. This will remove the unconditional acquisition of Giant for extended attribute related operations. If the file system is set as being MP safe and debug.mpsafevfs is 1, do not pickup Giant. Mark the following system calls as being MP safe so we no longer pickup Giant in the system call handler: o extattrctl o extattr_set_file o extattr_get_file o extattr_delete_file o extattr_set_fd o extattr_get_fd o extattr_delete_fd o extattr_set_link o extattr_get_link o extattr_delete_link o extattr_list_file o extattr_list_link o extattr_list_fd -Pass MPSAFE flags to namei(9) lookup and introduce vfslocked variable which will keep track of any Giant acquisitions. -Wrap any fd operations which manipulate vnodes in VFS_{UN}LOCK_GIANT -Drop VFS_ASSERT_GIANT into function which operate on vnodes to ensure that we are sufficiently protected. I've tested these changes with various TrustedBSD MAC policies which use extended attribute a lot on SMP and UP systems (thanks to Scott Long for making some SMP hardware available to me for testing). Discussed with: jeff Requested by: jhb, rwatson
|
#
68ff2a43 |
|
15-Sep-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Improve the MP safeness associated with the creation of symbolic links and the execution of ELF binaries. Two problems were found: 1) The link path wasn't tagged as being MP safe and thus was not properly protected. 2) The ELF interpreter vnode wasnt being locked in namei(9) and thus was insufficiently protected. This commit makes the following changes: -Sets the MPSAFE flag in NDINIT for symbolic link paths -Sets the MPSAFE flag in NDINIT and introduce a vfslocked variable which will be used to instruct VFS_UNLOCK_GIANT to unlock Giant if it has been picked up. -Drop in an assertion into vfs_lookup which ensures that if the MPSAFE flag is NOT set, that we have picked up giant. If not panic (if WITNESS compiled into the kernel). This should help us find conditions where vnode operations are in-sufficiently protected. This is a RELENG_6 candidate. Discussed with: jeff MFC after: 4 days
|
#
d8b464e5 |
|
01-Sep-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
In case of mac_check_vnode_rename_from() or vn_start_write() failure, vn_finished_write() should not be called. Reviewed by: ssouhlal MFC after: 3 days
|
#
06a13778 |
|
23-Jun-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Actually only protect mount-point if security.jail.enforce_statfs is set to 2. If we don't return statistics about requested file systems, system tools may not work correctly or at all. Approved by: re (scottl)
|
#
dbb3ec5c |
|
13-Jun-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove vnode lock asserts at the end of vfs syscalls. These asserts were used to ensure that we weren't exiting the syscall with a lock still held. This wasn't safe, however, because we'd already executed a vput() and on a loaded system the vnode may have been free'd by the time we assert. This functionality is also handled by the td_locks assert in userret, which doesn't tell you what the syscall was, but will at least panic before you deadlock. Sponsored by: Isilon Systems, Inc. Discovred by: Peter Holm Approved by: re (blanket vfs)
|
#
65ac438c |
|
12-Jun-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Do not allocate memory while holding a mutex. I introduce a very small race here (some file system can be mounted or unmounted between 'count' calculation and file systems list creation), but it is harmless. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>
|
#
3a996d6e |
|
11-Jun-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Do not allocate memory based on not-checked argument from userland. It can be used to panic the kernel by giving too big value. Fix it by moving allocation and size verification into kern_getfsstat(). This even simplifies kern_getfsstat() consumers, but destroys symmetry - memory is allocated inside kern_getfsstat(), but has to be freed by the caller. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>
|
#
820a0de9 |
|
09-Jun-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs and extend its functionality: value policy 0 show all mount-points without any restrictions 1 show only mount-points below jail's chroot and show only part of the mount-point's path (if jail's chroot directory is /jails/foo and mount-point is /jails/foo/usr/home only /usr/home will be shown) 2 show only mount-point where jail's chroot directory is placed. Default value is 2. Discussed with: rwatson
|
#
13a82b96 |
|
09-Jun-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Avoid code duplication in serval places by introducing universal kern_getfsstat() function. Obtained from: jhb
|
#
1209e08f |
|
08-Jun-2005 |
Craig Rodrigues <rodrigc@FreeBSD.org> |
Initialize uio_iovcnt to 1 in extattr_list_vp() and extattr_get_vp() PR: kern/79357 Approved by: rwatson
|
#
f8e5f642 |
|
28-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Acquire Giant explicitly in quotactl() so that the syscalls.master entry can become MSTD.
|
#
f73e1f57 |
|
27-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Acquire Giant explicitly in fhopen(), fhstat(), and kern_fhstatfs(), so that we can start to eliminate the presence of non-MPSAFE system call entries in syscalls.master.
|
#
9e9cfdca |
|
27-May-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove (now) unused argument 'td' from cvtstatfs().
|
#
2b19a055 |
|
27-May-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Sync locking in freebsd4_getfsstat() with getfsstat(). Giant is probably also needed in kern_fhstatfs().
|
#
fd916868 |
|
27-May-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Use consistent style in functions I want to modify in the near future.
|
#
c95cbf7e |
|
22-May-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Protect fsid in freebsd4_getfsstat() in simlar way as it is done in getfsstat().
|
#
a0e96a49 |
|
22-May-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us, so do not duplicate the code in cvtstatfs(). Note, that we now need to clear fsid in freebsd4_getfsstat(). This moves all security related checks from functions like cvtstatfs() and will allow to add more security related stuff (like statfs(2), etc. protection for jails) a bit easier.
|
#
836c5b41 |
|
11-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- vput(tvp) before vrele(tdvp) in kern_rename() to avoid lock order issues.
|
#
5ef9827c |
|
08-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove the namei NOOBJ flag. It is meaningless now. Sponsored by: Isilon Systems, Inc.
|
#
d830f828 |
|
24-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Pass LK_EXCLUSIVE to VFS_ROOT() to satisfy the new flags argument. For now, all calls to VFS_ROOT() should still acquire exclusive locks. Sponsored by: Isilon Systems, Inc.
|
#
ae88db8a |
|
23-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove the #ifdef LOOKUP_SHARED from some calls to NDINIT. The LOCKSHARED flag is simply ignored in namei() if LOOKUP_SHARED is not enabled. Sponsored by: Isilon Systems, Inc.
|
#
9331fd13 |
|
13-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Don't VOP_UNLOCK prior to VOP_REVOKE. The lock is required now. Sponsored by: Isilon Systems, Inc.
|
#
88e5b12a |
|
08-Feb-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Drag another softupdates tentacle back into FFS: Now that FFS's vop_fsync is separate from the internal use we can do the full job there.
|
#
fee4a6af |
|
07-Feb-2005 |
John Baldwin <jhb@FreeBSD.org> |
Implement a kern_pathconf() wrapper for pathconf() which can take the filename from either a user space or a kernel space pointer.
|
#
76951d21 |
|
07-Feb-2005 |
John Baldwin <jhb@FreeBSD.org> |
- Tweak kern_msgctl() to return a copy of the requested message queue id structure in the struct pointed to by the 3rd argument for IPC_STAT and get rid of the 4th argument. The old way returned a pointer into the kernel array that the calling function would then access afterwards without holding the appropriate locks and doing non-lock-safe things like copyout() with the data anyways. This change removes that unsafeness and resulting race conditions as well as simplifying the interface. - Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(), fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of code duplication for freebsd4 and netbsd compatability system calls. - Add a new lookup function kern_alternate_path() that looks up a filename under an alternate prefix and determines which filename should be used. This is basically a more general version of linux_emul_convpath() that can be shared by all the ABIs thus allowing for further reduction of code duplication.
|
#
ff05fd5d |
|
02-Feb-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Correct a typo in kern_rename. tvfslocked should be initialized from tond and not fromnd. This could lead us to leak Giant, or unlock it twice, depending on the filesystems involved. renames within a single filesystem would not have caused any problems. Sponsored by: Isilon Systems, Inc.
|
#
37c15216 |
|
01-Feb-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Or MPSAFE with the correct set of flags in stat(). This affected only the LOOKUP_SHARED case. Spotted by: jhb
|
#
8516dd18 |
|
24-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't use VOP_GETVOBJECT, use vp->v_object directly.
|
#
dcff5b14 |
|
24-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't call VOP_CREATEVOBJECT(), it's the responsibility of the filesystem which owns the vnode.
|
#
94a94585 |
|
24-Jan-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Change all vfs syscalls to use VFS_LOCK_GIANT(), and MPSAFE nds. - Move Giant acquisition into the few vfs syscalls that weren't already directly acquiring it. Sponsored By: Isilon Systems, Inc.
|
#
e39db32a |
|
12-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Ditch vfs_object_create() and make the callers call VOP_CREATEVOBJECT() directly.
|
#
8df6bac4 |
|
11-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson
|
#
9454b2d8 |
|
06-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
#
9bb42816 |
|
16-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Eliminate pointless goto.
|
#
48ab5b2d |
|
15-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Forgot to remove now unused variable in last commit.
|
#
136211e5 |
|
15-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
It is not necessary to hold vn_start_write/vn_finished_write around VOP_REVOKE.
|
#
718fe8e2 |
|
15-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Next FILEDESC_LOCK properly around FILE_LOCK
|
#
124e4c3b |
|
13-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST. Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.
|
#
ef11fbd7 |
|
07-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Introduce fdclose() which will clean an entry in a filedesc. Replace homerolled versions with call to fdclose(). Make fdunused() static to kern_descrip.c
|
#
56f21b9d |
|
26-Jul-2004 |
Colin Percival <cperciva@FreeBSD.org> |
Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb
|
#
f257b7a5 |
|
12-Jul-2004 |
Alfred Perlstein <alfred@FreeBSD.org> |
Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.
|
#
4f3bf9b9 |
|
24-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Don't cuddle else's so much as we removed additional parts of each block.
|
#
5e11031e |
|
24-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Remove temporary API bandage that allowed applications speaking the older API to list attributes on a file (zero-length attribute name) to function. extattr_list_*() are now the only available APIs to use when listing attributes.
|
#
9260798f |
|
21-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Acquire Giant in link() so that the system call can be marked MPSAFE. Don't want to acquire Giant in kern_link() sync linux compat code performs actions requiring Giant prior to calling kern_link().
|
#
694b21cf |
|
21-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Acquire Giant in link() so that we can mark it as MSTD in syscalls.master. Don't want to do it in kern_link() since the Linux emulation code calls kern_link() after performing other actions requiring Giant.
|
#
d7086f31 |
|
19-Jun-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Only initialize f_data and f_ops if nobody else did so already.
|
#
1930e303 |
|
11-Jun-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Deorbit COMPAT_SUNOS. We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.
|
#
79db0f1c |
|
06-Jun-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove unused code. Submitted by: Bjoern A. Zeeb
|
#
c4d85674 |
|
04-Jun-2004 |
Tim J. Robbins <tjr@FreeBSD.org> |
Remove a stale comment.
|
#
f52e2ef2 |
|
11-May-2004 |
Tim J. Robbins <tjr@FreeBSD.org> |
Eliminate a memory leak in kern_symlink() that could occur if vn_start_write() failed.
|
#
6c0ad4a7 |
|
26-Apr-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Always use nd.ni_vp->v_mount as an argument for VFS_QUOTACTL(), just like in RELENG_4. Pointed out by: Alex Lyashkov <umka@sevinter.net>
|
#
0c0c597f |
|
22-Apr-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Look out! vn_start_write() is able to return 0 and NULL 'mp'. Submitted by: Alex Lyashkov <shadow@psoft.net>
|
#
295ed752 |
|
06-Apr-2004 |
Bruce Evans <bde@FreeBSD.org> |
Removed some less than useful comments: - don't say what a small subset of the options includes are for. - don't mark up functions which use all their args with /* ARGSUSED */. The markup should have been removed when the unused retval parameter was removed. - don't comment on what routine suser() checks do. Removed nearby excessive vertical whitespace.
|
#
7f8a436f |
|
05-Apr-2004 |
Warner Losh <imp@FreeBSD.org> |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core
|
#
0b0a60fb |
|
05-Apr-2004 |
Doug Rabson <dfr@FreeBSD.org> |
Add lgetfh(2) which is like getfh(2) but doesn't follow symlinks.
|
#
31c7e8b0 |
|
16-Mar-2004 |
David Malone <dwmalone@FreeBSD.org> |
Nudge Giant as far as I can into kern_open(). Mark open() as MPSAFE. Use kern_open() to implement creat() rather than taking the long route through open(). Mark creat as MPSAFE. While I'm at it, mark nosys() (syscall 0) as MPSAFE, for all the difference it will make.
|
#
dd604e26 |
|
08-Mar-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Add two new sysctls: - security.bsd.hardlink_check_uid, when set, means, that unprivileged users are not permitted to create hard links to files not owned by them, - security.bsd.hardlink_check_gid, when set, means, that unprivileged users are not permitted to create hard links to files owned by group they don't belong to. OK'ed by: rwatson
|
#
a1cc6206 |
|
16-Feb-2004 |
David Malone <dwmalone@FreeBSD.org> |
Correct a comment. Reviewed by: alfred, tanimura
|
#
f08df373 |
|
14-Feb-2004 |
Robert Watson <rwatson@FreeBSD.org> |
By default, when a process in jail calls getfsstat(), only return the data for the file system on which the jail's root vnode is located. Previous behavior (show data for all mountpoints) can be restored by setting security.jail.getfsstatroot_only to 0. Note: this also has the effect of hiding other mounts inside a jail, such as /dev, /tmp, and /proc, but errs on the side of leaking less information.
|
#
a2fe44e8 |
|
15-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
New file descriptor allocation code, derived from similar code introduced in OpenBSD by Niels Provos. The patch introduces a bitmap of allocated file descriptors which is used to locate available descriptors when a new one is needed. It also moves the task of growing the file descriptor table out of fdalloc(), reducing complexity in both fdalloc() and do_dup(). Debts of gratitude are owed to tjr@ (who provided the original patch on which this work is based), grog@ (for the gdb(4) man page) and rwatson@ (for assistance with pxeboot(8)).
|
#
05c3c5c8 |
|
11-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Mechanical whitespace cleanup; parenthesize return values; other minor style nits. The #ifdefs in this file give me a headache...
|
#
69546b2f |
|
24-Dec-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Document that when we are addressing an open()/close() race, the reason we call vn_close() manually rather than letting fdrop() take care of it is that we haven't yet hooked up the various 'struct file' fields.
|
#
fde81c7d |
|
12-Nov-2003 |
Kirk McKusick <mckusick@FreeBSD.org> |
Update the statfs structure with 64-bit fields to allow accurate reporting of multi-terabyte filesystem sizes. You should build and boot a new kernel BEFORE doing a `make world' as the new kernel will know about binaries using the old statfs structure, but an old kernel will not know about the new system calls that support the new statfs structure. Running an old kernel after a `make world' will cause programs such as `df' that do a statfs system call to fail with a bad system call. Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Tim Robbins <tjr@freebsd.org> Reviewed by: Julian Elischer <julian@elischer.org> Reviewed by: the hoards of <arch@freebsd.org> Sponsored by: DARPA & NAI Labs.
|
#
e1419c08 |
|
19-Oct-2003 |
David Malone <dwmalone@FreeBSD.org> |
falloc allocates a file structure and adds it to the file descriptor table, acquiring the necessary locks as it works. It usually returns two references to the new descriptor: one in the descriptor table and one via a pointer argument. As falloc releases the FILEDESC lock before returning, there is a potential for a process to close the reference in the file descriptor table before falloc's caller gets to use the file. I don't think this can happen in practice at the moment, because Giant indirectly protects closes. To stop the file being completly closed in this situation, this change makes falloc set the refcount to two when both references are returned. This makes life easier for several of falloc's callers, because the first thing they previously did was grab an extra reference on the file. Reviewed by: iedowse Idea run past: jhb
|
#
c096756c |
|
21-Aug-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Add mac_check_vnode_deleteextattr() and mac_check_vnode_listextattr(): explicit access control checks to delete and list extended attributes on a vnode, rather than implicitly combining with the setextattr and getextattr checks. This reflects EA API changes in the kernel made recently, including the move to explicit VOP's for both of these operations. Obtained from: TrustedBSD PRoject Sponsored by: DARPA, Network Associates Laboratories
|
#
b2db7dc6 |
|
07-Aug-2003 |
John Baldwin <jhb@FreeBSD.org> |
td_dupfd just needs to be less than 0, it does not have to hold the negative value of the index of the new file, so just use -1.
|
#
76bd2355 |
|
04-Aug-2003 |
Ian Dowse <iedowse@FreeBSD.org> |
In the mknod(), mkfifo(), link(), symlink() and undelete() syscalls, use vrele() instead of vput() on the parent directory vnode returned by namei() in the case where it is equal to the target vnode. This handles namei()'s somewhat strange (but documented) behaviour of not locking either vnode when the two vnodes are equal and LOCKPARENT but not LOCKLEAF is specified. Note that since a vnode double-unlock is not currently fatal, these coding errors were effectively harmless. Spotted by: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de> Reviewed by: mckusick
|
#
9080ff25 |
|
28-Jul-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the kernel ACL interfaces and system call names. Break out UFS2 and FFS extattr delete and list vnode operations from setextattr and getextattr to deleteextattr and listextattr, which cleans up the implementations, and makes the results more readable, and makes the APIs more clear. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
cf774299 |
|
27-Jul-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Pass the file descriptor index down to vn_open. If the method vector was replaced and we got the "special return code" smile and trust that whatever happened below DTRT.
|
#
7c89f162 |
|
27-Jul-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.
|
#
a8d43c90 |
|
26-Jul-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland. The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/* For now pass -1 all over the place.
|
#
1226914c |
|
03-Jul-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use the f_vnode field to tell which file descriptors have a vnode.
|
#
6b42f0a2 |
|
22-Jun-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Prefer the vop_rmextattr() vnode operation for removing extended attributes from objects over vop_setextattr() with a NULL uio; if the file system doesn't support the vop_rmextattr() method, fall back to the vop_setextattr() method. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
3b6d9652 |
|
22-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a f_vnode field to struct file. Several of the subtypes have an associated vnode which is used for stuff like the f*() functions. By giving the vnode a speparate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use. At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.
|
#
eaaca5de |
|
20-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't (re)initialize f_gcflag to zero. Move initialization of DTYPE_VNODE specific field f_seqcount into the DTYPE_VNODE specific code.
|
#
6084b6c9 |
|
18-Jun-2003 |
Don Lewis <truckman@FreeBSD.org> |
FILE_LOCK() uses a pool mutex, as does the vnode v_vnlock. Since pool mutexes are supposed to only be used as leaf mutexes, and what appear to be separate pool mutexes could be aliased together, it is bad idea for a thread to attempt to hold two pool mutexes at the same time. Slightly rearrange the code in kern_open() so that FILE_UNLOCK() is called before calling VOP_GETVOBJECT(), which will grab the v_vnlock mutex.
|
#
2db4b023 |
|
18-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Introduce a new flag on a file descriptor: DFLAG_SEEKABLE and use that rather than assume that only DTYPE_VNODE is seekable.
|
#
677b542e |
|
10-Jun-2003 |
David E. O'Brien <obrien@FreeBSD.org> |
Use __FBSDID().
|
#
77762179 |
|
04-Jun-2003 |
Robert Watson <rwatson@FreeBSD.org> |
If a system call comes in requesting to retrieve an attribute named "", temporarily map it to a call to extattr_list_vp() to provide compatibility for older applications using the "" API to retrieve EA lists. Use VOP_LISTEXTATTR() to support extattr_list_vp() rather than VOP_GETEXTATTR(..., "", ...). Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Asssociates Laboratories
|
#
8bebbb1a |
|
03-Jun-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Implementations of extattr_list_fd(), extattr_list_file(), and extattr_list_link() system calls, which return a least of extended attributes defined for a vnode referenced by a file descriptor or path name. Currently, we just invoke VOP_GETEXTATTR() since it will convert a request for an empty name into a query for a name list, which was the old (more hackish) API. At some point in the near future, we'll push the distinction between get and list down to the vnode operation layer, but this provides access to the new API for applications in the short term. Pointed out by: Dominic Giampaolo <dbg@apple.com> Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
67096659 |
|
31-May-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove unused variable(s). Found by: FlexeLint
|
#
104a9b7e |
|
29-Apr-2003 |
Alexander Kabaev <kan@FreeBSD.org> |
Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
#
b6e48e03 |
|
23-Apr-2003 |
Alan Cox <alc@FreeBSD.org> |
- Acquire the vm_object's lock when performing vm_object_page_clean(). - Add a parameter to vm_pageout_flush() that tells vm_pageout_flush() whether its caller has locked the vm_object. (This is a temporary measure to bootstrap vm_object locking.)
|
#
fd7a8150 |
|
08-Apr-2003 |
Mike Barcroft <mike@FreeBSD.org> |
o In struct prison, add an allprison linked list of prisons (protected by allprison_mtx), a unique prison/jail identifier field, two path fields (pr_path for reporting and pr_root vnode instance) to store the chroot() point of each jail. o Add jail_attach(2) to allow a process to bind to an existing jail. o Add change_root() to perform the chroot operation on a specified vnode. o Generalize change_dir() to accept a vnode, and move namei() calls to callers of change_dir(). o Add a new sysctl (security.jail.list) which is a group of struct xprison instances that represent a snapshot of active jails. Reviewed by: rwatson, tjr
|
#
a184d471 |
|
05-Mar-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Move the initialization of the vattr flags field in setfflags() to before the MAC check so that we pass the flags field into the MAC check properly initialized. This didn't affect any current MAC modules since they didn't care what the flags argument was (as they were primarily interested in the fact that it was a meta-data write, not the contents of the write), but would be relevant to future modules relying on that field. Submitted by: Mike Halderman <mrh@spawar.navy.mil> Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
a163d034 |
|
18-Feb-2003 |
Warner Losh <imp@FreeBSD.org> |
Back out M_* changes, per decision of the TRB. Approved by: trb
|
#
a44009e0 |
|
15-Feb-2003 |
Jeffrey Hsu <hsu@FreeBSD.org> |
Remove extraneous FILEDESC_LOCK around atomic read.
|
#
565211b2 |
|
31-Jan-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Correct handling of locking for chroot() and chdir() cases: rather than having change_dir() release the vnode lock on success, hold the lock so that we can use it later when invoking MAC checks and VOP_ACCESS() in the chroot() code. Update the comment to reflect this calling convention. Update callers to unlock the vnode lock. Correct a typo regarding vnode naming in the MAC case that crept in via the previous patch applied.
|
#
7278944d |
|
31-Jan-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Clean up vnode handling on return from chroot() in certain error cases: we might multiply vrele() a vnode when certain classes of failures occur. This appears to stem from earlier Giant/file descriptor lock pushdown and restructuring. Submitted by: maxim
|
#
44956c98 |
|
21-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
48e3128b |
|
12-Jan-2003 |
Matthew Dillon <dillon@FreeBSD.org> |
Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.
|
#
cd72f218 |
|
11-Jan-2003 |
Matthew Dillon <dillon@FreeBSD.org> |
Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.
|
#
f0c09328 |
|
06-Jan-2003 |
Jacques Vidrine <nectar@FreeBSD.org> |
Correct file descriptor leaks in lseek and do_dup. The leak in lseek was introduced in vfs_syscalls.c revision 1.218. The leak in do_dup was introduced in kern_descrip.c revision 1.158. Submitted by: iedowse
|
#
f97182ac |
|
14-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
unwrap lines made short enough by SCARGS removal
|
#
b80521fe |
|
13-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
remove syscallarg(). Suggested by: peter
|
#
d1e405c5 |
|
13-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
SCARGS removal take II.
|
#
bc9e75d7 |
|
13-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Backout removal SCARGS, the code freeze is only "selectively" over.
|
#
0bbe7292 |
|
13-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove SCARGS. Reviewed by: md5
|
#
4e08ccb2 |
|
27-Oct-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Fix a case in kern_rename() where a vn_finished_write() call was missed. This bug has been present since the vn_start_write() and vn_finished_write() calls were first added in revision 1.159. When the case is triggered, any attempts to create snapshots on the filesystem will deadlock and also prevent further write activity on that filesystem.
|
#
c7047e52 |
|
27-Oct-2002 |
Garrett Wollman <wollman@FreeBSD.org> |
Change the way support for asynchronous I/O is indicated to applications to conform to 1003.1-2001. Make it possible for applications to actually tell whether or not asynchronous I/O is supported. Since FreeBSD's aio implementation works on all descriptor types, don't call down into file or vnode ops when [f]pathconf() is asked about _PC_ASYNC_IO; this avoids the need for every file and vnode op to know about it.
|
#
7587203c |
|
19-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Hook up most of the MAC entry points relating to file/directory/node creation, deletion, and rename. There are one or two other stray cases I'll catch in follow-up commits (such as unix domain socket creation); this permits MAC policy modules to limit the ability to perform these operations based on existing UNIX credential / vnode attributes, extended attributes, and security labels. In the rename case using MAC, we now have to lock the from directory and file vnodes for the MAC check, but this is done only in the MAC case, and the locks are immediately released so that the remainder of the rename implementation remains the same. Because the create check takes a vattr to know object type information, we now initialize additional fields in the VATTR passed to VOP_SYMLINK() in the MAC case. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
2dba710d |
|
10-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Incremental style improvements: more consistently avoid assignments in conditionals; remove some excess vertical whitespace; remove a bug in the return handling of the delete_vp() case for MAC. Spotted by: bde
|
#
b101411b |
|
09-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Explore new heights in alphabetization for _file and _fd variations on the extended attribute system calls.
|
#
6f90723c |
|
09-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Implement extattr_{delete,get,set}_link() system calls: extended attribute operations that do not follow links. Sync to MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
197b023b |
|
07-Oct-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Add back a fdrop() call at the end of kern_open() that got lost in revision 1.218. This bug caused a "struct file" reference to be leaked if VOP_ADVLOCK(), vn_start_write(), or mac_check_vnode_write() failed during the open operation. PR: kern/43739 Reported by: Arne Woerner <woerner@mediabase-gmbh.de>
|
#
0a694196 |
|
05-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Merge support for mac_check_vnode_link(), a MAC framework/policy entry point that instruments the creation of hard links. Policy implementations to follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
8c5d0137 |
|
02-Oct-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix mis-indentation. Spotted by: FlexeLint
|
#
d40a8125 |
|
24-Sep-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
- Properly lock v_vflags in getdirents().
|
#
fa288043 |
|
19-Sep-2002 |
Don Lewis <truckman@FreeBSD.org> |
VOP_FSYNC() requires that it's vnode argument be locked, which nfs_link() wasn't doing. Rather than just lock and unlock the vnode around the call to VOP_FSYNC(), implement rwatson's suggestion to lock the file vnode in kern_link() before calling VOP_LINK(), since the other filesystems also locked the file vnode right away in their link methods. Remove the locking and and unlocking from the leaf filesystem link methods. Reviewed by: rwatson, bde (except for the unionfs_link() changes)
|
#
d3a7b5e7 |
|
10-Sep-2002 |
Bruce Evans <bde@FreeBSD.org> |
vfs_syscalls.c: Changed rename(2) to follow the letter of the POSIX spec. POSIX requires rename() to have no effect if its args "resolve to the same existing file". I think "file" can only reasonably be read as referring to the inode, although the rationale and "resolve" seem to say that sameness is at the level of (resolved) directory entries. ext2fs_vnops.c, ufs_vnops.c: Replaced code that gave the historical BSD behaviour of removing one link name by checks that this code is now unreachable. This fixes some races. All vnodes needed to be unlocked for the removal, and locking at another level using something like IN_RENAME was not even attempted, so it was possible for rename(x, y) to return with both x and y removed even without any unlink(2) syscalls (one process can remove x using rename(x, y) and another process can remove y using rename(y, x)). Prodded by: alfred MFC after: 8 weeks PR: 42617
|
#
8f19eb88 |
|
01-Sep-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Split out a number of mostly VFS and signal related syscalls into a kernel-internal kern_*() version and a wrapper that is called via the syscall vector table. For paths and structure pointers, the internal version either takes a uio_seg parameter or requires the caller to copyin() the data to kernel memory as appropiate. This will permit emulation layers to use these syscalls without having to copy out translated arguments to the stack gap. Discussed on: -arch Review/suggestions: bde, jhb, peter, marcel
|
#
856d3a05 |
|
20-Aug-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
- Hold the vnode lock across unlink() so that the v_vflag check is safe. - Fix the long broken error handling for VV_ROOT and VDIR.
|
#
177142e4 |
|
19-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Pass active_cred and file_cred into the MAC framework explicitly for mac_check_vnode_{poll,read,stat,write}(). Pass in fp->f_cred when calling these checks with a struct file available. Otherwise, pass NOCRED. All currently MAC policies use active_cred, but could now offer the cached credential semantic used for the base system security model. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
7f724f8b |
|
19-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Break out mac_check_vnode_op() into three seperate checks: mac_check_vnode_poll(), mac_check_vnode_read(), mac_check_vnode_write(). This improves the consistency with other existing vnode checks, and allows policies to avoid implementing switch statements to determine what operations they do and do not want to authorize. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
ea6027a8 |
|
15-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Make similar changes to fo_stat() and fo_poll() as made earlier to fo_read() and fo_write(): explicitly use the cred argument to fo_poll() as "active_cred" using the passed file descriptor's f_cred reference to provide access to the file credential. Add an active_cred argument to fo_stat() so that implementers have access to the active credential as well as the file credential. Generally modify callers of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which was redundantly provided via the fp argument. This set of modifications also permits threads to perform these operations on behalf of another thread without modifying their credential. Trickle this change down into fo_stat/poll() implementations: - badfo_poll(), badfo_stat(): modify/add arguments. - kqueue_poll(), kqueue_stat(): modify arguments. - pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to MAC checks rather than td->td_ucred. - soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather than cred to pru_sopoll() to maintain current semantics. - sopoll(): moidfy arguments. - vn_poll(), vn_statfile(): modify/add arguments, pass new arguments to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL() to maintian current semantics. - vn_close(): rename cred to file_cred to reflect reality while I'm here. - vn_stat(): Add active_cred and file_cred arguments to vn_stat() and consumers so that this distinction is maintained at the VFS as well as 'struct file' layer. Pass active_cred instead of td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics. - fifofs: modify the creation of a "filetemp" so that the file credential is properly initialized and can be used in the socket code if desired. Pass ap->a_td->td_ucred as the active credential to soo_poll(). If we teach the vnop interface about the distinction between file and active credentials, we would use the active credential here. Note that current inconsistent passing of active_cred vs. file_cred to VOP's is maintained. It's not clear why GETATTR would be authorized using active_cred while POLL would be authorized using file_cred at the file system level. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
e6e370a7 |
|
04-Aug-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
|
#
18b770b2 |
|
01-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce support for Mandatory Access Control and extensible kernel access control. Invoke appropriate MAC framework entry points to authorize readdir() operations in the native ABI. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
f9d0d524 |
|
01-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Include file cleanup; mac.h and malloc.h at one point had ordering relationship requirements, and no longer do. Reminded by: bde
|
#
f4d2cfdd |
|
01-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce support for Mandatory Access Control and extensible kernel access control. Invoke appropriate MAC entry points to authorize the following operations: truncate on open() (write) access() (access) readlink() (readlink) chflags(), lchflags(), fchflags() (setflag) chmod(), fchmod(), lchmod() (setmode) chown(), fchown(), lchown() (setowner) utimes(), lutimes(), futimes() (setutimes) truncate(), ftrunfcate() (write) revoke() (revoke) fhopen() (open) truncate on fhopen() (write) extattr_set_fd, extattr_set_file() (setextattr) extattr_get_fd, extattr_get_file() (getextattr) extattr_delete_fd(), extattr_delete_file() (setextattr) These entry points permit MAC policies to enforce a variety of protections on vnodes. More vnode checks to come, especially in non-native ABIs. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
b3e13e1c |
|
31-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce support for Mandatory Access Control and extensible kernel access control. Instrument chdir() and chroot()-related system calls to invoke appropriate MAC entry points to authorize the two operations. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
b285e7f9 |
|
31-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Improve formatting and variable use consistency in extattr system calls. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
956fc3f8 |
|
31-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Simplify the logic to enter VFS_EXTATTRCTL(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
2712d0ee |
|
30-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce support for Mandatory Access Control and extensible kernel access control. Implement MAC framework access control entry points relating to operations on mountpoints. Currently, this consists only of access control on mountpoint listing using the various statfs() variations. In the future, it might also be desirable to implement checks on mount() and unmount(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
e66c87b7 |
|
30-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
When referencing nd_cnp after namei(), always pass SAVENAME into NDINIT() operation flags. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
0b1040cb |
|
21-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Set VAPPEND in open mode when O_APPEND is specified as an argument to open() of fhopen(). Currently this has no actual affect due to the treatment of VAPPEND in vaccess() and vaccess_acl() as a subset of VWRITE, but when MAC comes in, MAC will distinguish the two. Note: if any file systems are cutting their own permission models, they may wish to now take this into account. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
fb36a3d8 |
|
16-Jul-2002 |
Kirk McKusick <mckusick@FreeBSD.org> |
Change utimes to set the file creation time (for filesystems that support creation times such as UFS2) to the value of the modification time if the value of the modification time is older than the current creation time. See utimes(2) for further details. Sponsored by: DARPA & NAI Labs.
|
#
faab4e27 |
|
16-Jul-2002 |
Kirk McKusick <mckusick@FreeBSD.org> |
Change the name of st_createtime to st_birthtime. This change is made to reduce confusion between st_ctime and st_createtime. Submitted by: Eric Allman <eric@sendmail.org> Sponsored by: DARPA & NAI Labs.
|
#
03d7a9ff |
|
12-Jul-2002 |
John Baldwin <jhb@FreeBSD.org> |
- Change chroot_refuse_vdir_fds() to require that the passed in struct filedesc is already locked rather than having chroot() unlock the filedesc so chroot_refuse_vdir_fds() can immediately relock it. - Reorder chroot() a bitso that we do the namei lookup before checking the process's struct filedesc. This closes at least one potential race and allows us to only acquire the filedsec lock once in chroot(). - Push down Giant slightly into chroot().
|
#
2b4edb69 |
|
02-Jul-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Move every code related to mount(2) in a new file, vfs_mount.c. The file vfs_conf.c which was dealing with root mounting has been repo-copied into vfs_mount.c to preserve history. This makes nmount related development easier, and help reducing the size of vfs_syscalls.c, which is still an enormous file. Reviewed by: rwatson Repo-copy by: peter
|
#
6bd521df |
|
01-Jul-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Use indirect function pointer hooks instead of #ifdef SOFTUPDATES direct calls for the two places where the kernel calls into soft updates code. Set up the hooks in softdep_initialize() and NULL them out in softdep_uninitialize(). This change allows soft updates to function correctly when ufs is loaded as a module. Reviewed by: mckusick
|
#
a7884425 |
|
28-Jun-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove unneeded casts to caddr_t.
|
#
84b2995b |
|
28-Jun-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
In vn_mkdir(), use vrele() instead of vput() on the parent directory vnode in the case that the target exists and is the same vnode as the parent (i.e. "mkdir ."). The namei() call does not leave the vnode locked in this case even though you might expect it to. This bug was mostly harmless in practice because unlocking an already unlocked vnode currently does not trigger any panics or warnings. Reviewed by: jeff
|
#
d374764f |
|
24-Jun-2002 |
Kirk McKusick <mckusick@FreeBSD.org> |
Use proper size in bzero of stat structure. Submitted by: Jake Burkholder <jake@locore.ca> Sponsored by: DARPA & NAI Labs.
|
#
6524dddc |
|
22-Jun-2002 |
Kirk McKusick <mckusick@FreeBSD.org> |
This patch fixes a size problem with the stat structure for 64-bit architectures that was introduced in the UFS2 code merge two days ago. The stat structure change that caused the problem was the addition of the file create time. Submitted by: Bruce Evans <bde@zeta.org.au> Sponsored by: DARPA & NAI Labs.
|
#
cacd1c9b |
|
22-Jun-2002 |
Maxime Henrion <mux@FreeBSD.org> |
o Remove the initialization of unused fields in the struct uio now that we don't use uiomove() anymore. o Enforce stricter checks on the length of the iov's in nmount(2) since we now malloc() them individually and corrupted iov's could make the kernel crash in malloc() with "kmem_map too small". Reviewed by: phk
|
#
1c85e6a3 |
|
21-Jun-2002 |
Kirk McKusick <mckusick@FreeBSD.org> |
This commit adds basic support for the UFS2 filesystem. The UFS2 filesystem expands the inode to 256 bytes to make space for 64-bit block pointers. It also adds a file-creation time field, an ability to use jumbo blocks per inode to allow extent like pointer density, and space for extended attributes (up to twice the filesystem block size worth of attributes, e.g., on a 16K filesystem, there is space for 32K of attributes). UFS2 fully supports and runs existing UFS1 filesystems. New filesystems built using newfs can be built in either UFS1 or UFS2 format using the -O option. In this commit UFS1 is the default format, so if you want to build UFS2 format filesystems, you must specify -O 2. This default will be changed to UFS2 when UFS2 proves itself to be stable. In this commit the boot code for reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c) as there is insufficient space in the boot block. Once the size of the boot block is increased, this code can be defined. Things to note: the definition of SBSIZE has changed to SBLOCKSIZE. The header file <ufs/ufs/dinode.h> must be included before <ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and ufs_lbn_t. Still TODO: Verify that the first level bootstraps work for all the architectures. Convert the utility ffsinfo to understand UFS2 and test growfs. Add support for the extended attribute storage. Update soft updates to ensure integrity of extended attribute storage. Switch the current extended attribute interfaces to use the extended attribute storage. Add the extent like functionality (framework is there, but is currently never used). Sponsored by: DARPA & NAI Labs. Reviewed by: Poul-Henning Kamp <phk@freebsd.org>
|
#
7d2d4409 |
|
20-Jun-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Change the way we internally store the mount options to a linked list. This is to allow the merging of the mount options in the MNT_UPDATE case, as the current data structure is unsuitable for this. There are no functional differences in this commit. Reviewed by: phk
|
#
8eb0098f |
|
28-May-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Remove a duplicated vfs_freeopts() that I introduced in last revision.
|
#
2274ec99 |
|
23-May-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Style nit, no functional changes.
|
#
9ee6bf71 |
|
23-May-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Slightly change the way we pass mount options to the filesystem VFS_NMOUNT operations. Reviewed by: phk
|
#
e9e705b0 |
|
20-May-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Change two vput() that should have been vrele(). Submitted by: iedowse
|
#
d394511d |
|
16-May-2002 |
Tom Rhodes <trhodes@FreeBSD.org> |
More s/file system/filesystem/g
|
#
0e2d6cc8 |
|
14-May-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Disable the shared locking namei() code for now. It breaks several stacking filesystems. This is on hold until the rest of VFS Locking is reviewed and deemed safe. It can be enabled with 'options LOOKUP_SHARED'.
|
#
9d997d8b |
|
05-May-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Add the lchflags(2) syscall. Reviewed by: rwatson
|
#
576365ba |
|
05-May-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Move a KASSERT() in open() prior to unlocking the vnode. It's not safe to call VOP_GETVOBJECT without a lock.
|
#
afd458b0 |
|
04-May-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Fix a typo. Submitted by: dwmalone
|
#
7a0776e4 |
|
22-Apr-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Slightly restructure extattr_get_vp() so that there's only one entry point to VOP_GETEXTATTR(). This simplifies code flow when inserting MAC hooks. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
05103170 |
|
19-Apr-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Improve style consistency of vfs_syscalls.c by converting the style used in various extattr_*() calls to match the rest of the file. Originally, these bits at the end looked more like style(9). This patch was submitted by green by way of the TrustedBSD MAC tree, and I fixed a few problems with it on the way through. Someone with more time on their hands should convert the entire file to style(9); this commit is for diff reduction purposes. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
df99ca52 |
|
16-Apr-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
The recent NFS forced unmount improvements introduced a side-effect where some client operations might be unexpectedly cancelled during an unsuccessful non-forced unmount attempt. This causes problems for amd(8), because it periodically attempts a non-forced unmount to check if the filesystem is still in use. Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set only during the operation of a forced unmount. Use this instead of MNTK_UNMOUNT to trigger the cancellation of hung NFS operations. Also correct a problem where dounmount() might inadvertently clear the MNTK_UNMOUNT flag. Reported by: simokawa MFC after: 1 week
|
#
a59f8b9e |
|
08-Apr-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Turn #ifdef LOOKUP_SHARED into #ifndef LOOKUP_EXCLUSIVE to enable this behavior by default. Also, change the options line to reflect this. If there are no problems reported this will become the only behavior and the knob will be removed in a month or so. Demanded by: obrien
|
#
a48ca369 |
|
08-Apr-2002 |
Maxime Henrion <mux@FreeBSD.org> |
The fourth parameter to copystr() is a size_t, not an int. Approved by: peter
|
#
9d835373 |
|
07-Apr-2002 |
Maxime Henrion <mux@FreeBSD.org> |
o Change kernel_vmount() interface to be more convenient : pass two separate strings instead of passing "foo=bar". o Don't forget to clear the VMOUNT flag on the vnode when vfs_nmount() fails because the fs doesn't implement VFS_NMOUNT (and in vfs_mount() when the fs doesn't implement VFS_MOUNT) ; also decrement the vfs refcount in the !MNT_UPDATE case.
|
#
bcc93175 |
|
02-Apr-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Add two forgotten vfs_unbusy() calls, in vfs_mount() and vfs_nmount(). Reviewed by: phk
|
#
44731cab |
|
01-Apr-2002 |
John Baldwin <jhb@FreeBSD.org> |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@
|
#
daab5e24 |
|
28-Mar-2002 |
Maxime Henrion <mux@FreeBSD.org> |
- Properly sync vfs_nmount() with changes that have be already done in vfs_mount(), in particular revisions 1.215, 1.227 and 1.240. - flag2 is a low quality variable name, change it to kern_flag. - strncpy NUL-terminates f_fstypename and f_mntonname since the strings have length <= <buffer length> - 1, so the explicit NUL-termination is bogus. - M_ZERO'ing space for fstype and fspath is stupid since we never use the space beyond the end of the string. - Do various style(9) cleanups in both functions. Submitted by: bde Reviewed by: phk
|
#
dcce8874 |
|
26-Mar-2002 |
Andrew R. Reiter <arr@FreeBSD.org> |
- Fixup a few style nits: - return error -> return (error); - move a declaration to the top of the function. - become bug for bug compatible with if (error) lines. Submitted by: bde
|
#
17594b93 |
|
26-Mar-2002 |
Maxime Henrion <mux@FreeBSD.org> |
As discussed in -arch, add the new nmount(2) system call and the new vfs_getopt()/vfs_copyopt() API. This is intended to be used later, when there will be filesystems implementing the VFS_NMOUNT operation. The mount(2) system call will disappear when all filesystems will be converted to the new API. Documentation will be committed in a while. Reviewed by: phk
|
#
517f30c2 |
|
25-Mar-2002 |
Andrew R. Reiter <arr@FreeBSD.org> |
- Recommit the securelevel_gt() calls removed by commits rev. 1.84 of kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the source of the recent panics occuring around kldloading file system support modules. Requested by: rwatson
|
#
fe3240e9 |
|
21-Mar-2002 |
Andrew R. Reiter <arr@FreeBSD.org> |
- Back out the commit to make the linker_load_file() securelevel check made aware in jail environments. Supposedly something is broken, so this should be backed out until further investigation proves otherwise, or a proper fix can be provided.
|
#
e85b9ae9 |
|
21-Mar-2002 |
Andrew R. Reiter <arr@FreeBSD.org> |
- Fix a logic error in checking the securelevel that was introduced in the previous commit. Pointy hats to: arr, rwatson
|
#
c457a440 |
|
20-Mar-2002 |
Andrew R. Reiter <arr@FreeBSD.org> |
- Change a check of securelevel to securelevel_gt() call in order to help against users within a jail attempting to load kernel modules. - Add a check of securelevel_gt() to vfs_mount() in order to chop some low hanging fruit for the repair of securelevel checking of linking and unlinking files from within jails. There is more to be done here. Reviewed by: rwatson
|
#
c897b813 |
|
19-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.
|
#
4d77a549 |
|
19-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove __P.
|
#
4a950215 |
|
18-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Close a race when vfs_syscalls.c:checkdirs() runs. To do this protect the filedesc pointer in the proc with PROC_LOCK in both checkdirs() and kern_descrip.c:fdfree().
|
#
8de00f4a |
|
11-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
This patch adds the "LOCKSHARED" option to namei which causes it to only acquire shared locks on leafs. The stat() and open() calls have been changed to make use of this new functionality. Using shared locks in these cases is sufficient and can significantly reduce their latency if IO is pending to these vnodes. Also, this reduces the number of exclusive locks that are floating around in the system, which helps reduce the number of deadlocks that occur. A new kernel option "LOOKUP_SHARED" has been added. It defaults to off so this patch can be turned on for testing, and should eventually go away once it is proven to be stable. I have personally been running this patch for over a year now, so it is believed to be fully stable. Reviewed by: jake, obrien Approved by: jake
|
#
89e1164e |
|
05-Mar-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Three p_ucred -> td_ucred's missed in jhb's earlier pass; all appear to be safe.
|
#
b0ad6e20 |
|
05-Mar-2002 |
Robert Watson <rwatson@FreeBSD.org> |
The change from td->td_proc->p_ucred to td->td_ucred has shortened some lines: more agressively line wrap under those circumstances.
|
#
bdd67d48 |
|
27-Feb-2002 |
John Baldwin <jhb@FreeBSD.org> |
- Change namei() to use td_ucred instead of p_ucred. - Change the hack in access() that uses a temporary credential to set td_ucred to the temp cred instead of p_ucred.
|
#
a854ed98 |
|
27-Feb-2002 |
John Baldwin <jhb@FreeBSD.org> |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
#
1ea030d8 |
|
10-Feb-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Make sure to hold vnode lock when calling into VOP_GETATTR(). Discussed with: mckusick, phk
|
#
c0a9dc83 |
|
10-Feb-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Make sure to grab vnode lock on a vnode before calling VOP_GETATTR() to perform an ownership test in revoke(). This is also required for MAC hooks so that the vnode lock is held during a call to the MAC framework. Release the lock before calling VOP_REVOKE(). Discussed with: phk, mckusick
|
#
56e04d01 |
|
09-Feb-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Remove a stray 'const' that slept into extattr_set_vp(), and could result in compiler warnings.
|
#
74237f55 |
|
09-Feb-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Part I: Update extended attribute API and ABI: o Modify the system call syntax for extattr_{get,set}_{fd,file}() so as not to use the scatter gather API (which appeared not to be used by any consumers, and be less portable), rather, accepts 'data' and 'nbytes' in the style of other simple read/write interfaces. This changes the API and ABI. o Modify system call semantics so that extattr_get_{fd,file}() return a size_t. When performing a read, the number of bytes read will be returned, unless the data pointer is NULL, in which case the number of bytes of data are returned. This changes the API only. o Modify the VOP_GETEXTATTR() vnode operation to accept a *size_t argument so as to return the size, if desirable. If set to NULL, the size will not be returned. o Update various filesystems (pseodofs, ufs) to DTRT. These changes should make extended attributes more useful and more portable. More commits to rebuild the system call files, as well as update userland utilities to follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
143bb598 |
|
07-Feb-2002 |
Robert Watson <rwatson@FreeBSD.org> |
o Merge various recent fixes from the MAC branch relating to extattrctl(): - Fix null-pointer dereference introduced when snapshotting was introduced. This occured because unlike the previous code, vn_start_write() doesn't always return a non-NULL mp, as filesystems may not support the VOP_GETWRITEMOUNT() call. For now, rely on two pointers, so that vn_finished_write() works properly. - Fix locking problems on exit, introduced at some past time, some when snapshots came in, where a vnode might not be unlocked before being vrele'd in various error situations. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
079b7bad |
|
07-Feb-2002 |
Julian Elischer <julian@FreeBSD.org> |
Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embeded there but remove all teh code that assumes that in preparation for the next commit which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
|
#
b7184973 |
|
01-Feb-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Don't recurse on filedesc lock in chroot_refuse_vdir_fds(). Noticed by: Michael Nottebrock <michaelnottebrock@gmx.net>
|
#
a4db4953 |
|
13-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Replace ffind_* with fget calls. Make fget MPsafe. Make fgetvp and fgetsock use the fget subsystem to reduce code bloat. Push giant down in fpathconf().
|
#
426da3bc |
|
13-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file *fp); /* increments reference count on a file */ struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */ struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.
|
#
1f493270 |
|
09-Jan-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Change dounmount() to return EBUSY in the non-MNT_FORCE case if we can't acquire the mnt_lock without blocking. Normally non-forced unmount attempts return EBUSY quickly if any vnodes are active, so this just extends that behaviour to cover the per-mount mnt_lock too.
|
#
10cc6dff |
|
03-Jan-2002 |
Stefan Eßer <se@FreeBSD.org> |
Return EBADF in case some vnode field has been reset to a NULL pointer. (There has been some discussion, whether ENOENT or EBADF is more appropriate. I choose the latter, since the operation is not supported on the file descriptor at that time, even if it was, immediately before.) PR: 32681 Reviewed by: dillon, iedowse, ... Approved by: nectar MFC after: 3 days (pending RE approval)
|
#
751a2cd0 |
|
05-Nov-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Define a new mount flag "MNT_JAILDEVFS" Collect the magic combination of flags which can be updated into a macro in sys/mount.h rather than inlining them (twice!) in vfs_syscalls.c
|
#
6b8bd2ef |
|
04-Nov-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Add mnt_reservedvnlist so we can MFC to 4.x, in order to make all mount structure changes now rather then piecemeal later on. mnt_nvnodelist currently holds all the vnodes under the mount point. This will eventually be split into a 'dirty' and 'clean' list. This way we only break kld's once rather then twice. nvnodelist will eventually turn into the dirty list and should remain compatible with the klds.
|
#
cd778f02 |
|
02-Nov-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Remove the local temporary variable "struct proc *p" from vfs_mount() in vfs_syscalls.c. Although it did save some indirection, many of those savings will be obscured with the impending commit of suser() changes, and the result is increased code complexity. Also, once p->p_ucred and td->td_ucred are distinguished, this will make vfs_mount() use the correct thread credential, rather than the process credential.
|
#
0bd1a2d0 |
|
02-Nov-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Argh! patch added the nmount at the bottom first time around. Take 3!
|
#
bad69977 |
|
02-Nov-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add empty shell for nmount syscall (take 2!)
|
#
06d133c4 |
|
02-Nov-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add nmount() stub function and regenerate the syscall-glue which should not need to check in generated files.
|
#
a06fe511 |
|
24-Oct-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
unwind v_writecount in fhopen() if we are unable to allocate the descriptor. MFC after: 3 days
|
#
c72ccd01 |
|
22-Oct-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. MFC after: 3 days
|
#
c6ab2f6b |
|
01-Oct-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Complete the migration from suser error checking in the following form in vfs_syscalls.c: if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid && (error = suser_td(td)) != 0) { unwrap_lots_of_stuff(); return (error); } to: if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid) { error = suser_td(td); if (error) { unwrap_lots_of_stuff(); return (error); } } This makes the code more readable when complex clauses are in use, and minimizes conflicts for large outstanding patchsets modifying the kernel authorization code (of which I have several), especially where existing authorization and context code are combined in the same if() conditional. Obtained from: TrustedBSD Project
|
#
b4799065 |
|
21-Sep-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o vpaccess() -> vn_access() -- Peter reminds me that there is already a convention for vnop helper routines of this sort. Submitted by: Mr Wemm <peter>
|
#
9c94f773 |
|
21-Sep-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Introduce eaccess(2), a version of access(2) that uses the effective credentials rather than the real credentials. This is useful for implementing GUI's which need to modify icons based on access rights, but where use of open(2) is too expensive, use of stat(2) doesn't reflect the file system's real protection model, and use of access() suffers from real/effective credential confusion. This implementation provides the same semantics as the call of the same name on SCO OpenServer. Note: using this call improperly can leave you subject to some of the same races present in the access(2) call. o To implement this, break out the basic logic of access(2) into vpaccess(), which accepts a passed credential to perform the invocation of VOP_ACCESS(). Add eaccess(2) to invoke vpaccess(), and modify access(2) to use vpaccess(). Obtained from: TrustedBSD Project
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
|
#
63347f1e |
|
29-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
lseek: simplify overflow checks
|
#
c4778eed |
|
26-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
Cosmetique & style fixes from bde
|
#
db106eff |
|
23-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
lseek: fix check for vattr.va_size overflow. Check suggested by bde simple not works with unsigned types.
|
#
b82f5b62 |
|
23-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
Cosmetique: more <sys/*> into one group, separate include families by blank line
|
#
383f169d |
|
21-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
Make lseek() POSIXed: for non character special files 1) handle off_t overflow with EOVERFLOW 2) handle negative offsets with EINVAL Reviewed by: arch discussion
|
#
8774836b |
|
20-Aug-2001 |
Ian Dowse <iedowse@FreeBSD.org> |
Avoid sleeping while holding a mutex in dounmount(). This problem has existed for a long time, but I made it worse a few months ago by by adding calls to VFS_ROOT() and checkdirs() in revision 1.179. Also, remove the LK_REENABLE flag in the lockmgr() call; this flag has been ignored by the lockmgr code for 4 years. This was the only remaining mention of it apart from its definition. Reviewed by: jhb
|
#
a9a8ba3d |
|
10-Aug-2001 |
Ian Dowse <iedowse@FreeBSD.org> |
Arbitrarily limit to 64k the number of bytes that can be read at a time using the ogetdirentries() compatibility syscall. This is a hack to ensure that rediculous values don't get passed to MALLOC(). Reviewed by: kris
|
#
f0cc1c6f |
|
09-Jul-2001 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Constify the fstype argument to vfs_mount(). This eliminates at least one "call discards qualifier" warning (in sys/compat/linux/linux_file.c).
|
#
0cddd8f0 |
|
04-Jul-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
c0a0fb85 |
|
06-Jun-2001 |
Thomas Moestl <tmm@FreeBSD.org> |
Fix an instance of NDINIT in the extattrctl syscall: LOCKLEAF was or'ed to the operation parameter, not to the flags as it should be. Reviewed by: rwatson
|
#
b1fc0ec1 |
|
25-May-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit
|
#
bdc60f5b |
|
23-May-2001 |
John Baldwin <jhb@FreeBSD.org> |
Don't release Giant around vm_oject_page_clean() in fsync() as the pager putpages called will need Giant.
|
#
99d300a1 |
|
23-May-2001 |
Ruslan Ermilov <ru@FreeBSD.org> |
- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs. - Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs. - Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS. - Install header files for the above file systems. - Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.
|
#
23955314 |
|
18-May-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
|
#
60fb0ce3 |
|
28-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Revert consequences of changes to mount.h, part 2. Requested by: bde
|
#
d98dc34f |
|
23-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Correct #includes to work with fixed sys/mount.h.
|
#
fec605c8 |
|
31-Mar-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Introduce extattr_{delete,get,set}_fd() to allow extended attribute operations on file descriptors, which complement the existing set of calls, extattr_{delete,get,set}_file() which act on paths. In doing so, restructure the system call implementation such that the two sets of functions share most of the relevant code, rather than duplicating it. This pushes the vnode locking into the shared code, but keeps the copying in of some arguments in the system call code. Allowing access via file descriptors reduces the opportunity for race conditions when managing extended attributes. Obtained from: TrustedBSD Project
|
#
1005a129 |
|
28-Mar-2001 |
John Baldwin <jhb@FreeBSD.org> |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
#
0abc15fd |
|
20-Mar-2001 |
Bruce Evans <bde@FreeBSD.org> |
Fixed breakage of access() in rev.1.164. Wrong credentials were used for the final path component.
|
#
30632071 |
|
18-Mar-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Rename "namespace" argument to "attrnamespace" as namespace is a C++ reserved word. Submitted by: jkh Obtained from: TrustedBSD Project
|
#
70f36851 |
|
14-Mar-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Change the API and ABI of the Extended Attribute kernel interfaces to introduce a new argument, "namespace", rather than relying on a first- character namespace indicator. This is in line with more recent thinking on EA interfaces on various mailing lists, including the posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces are defined by default, EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER, where the primary distinction lies in the access control model: user EAs are accessible based on the normal MAC and DAC file/directory protections, and system attributes are limited to kernel-originated or appropriately privileged userland requests. o These API changes occur at several levels: the namespace argument is introduced in the extattr_{get,set}_file() system call interfaces, at the vnode operation level in the vop_{get,set}extattr() interfaces, and in the UFS extended attribute implementation. Changes are also introduced in the VFS extattrctl() interface (system call, VFS, and UFS implementation), where the arguments are modified to include a namespace field, as well as modified to advoid direct access to userspace variables from below the VFS layer (in the style of recent changes to mount by adrian@FreeBSD.org). This required some cleanup and bug fixing regarding VFS locks and the VFS interface, as a vnode pointer may now be optionally submitted to the VFS_EXTATTRCTL() call. Updated documentation for the VFS interface will be committed shortly. o In the near future, the auto-starting feature will be updated to search two sub-directories to the ".attribute" directory in appropriate file systems: "user" and "system" to locate attributes intended for those namespaces, as the single filename is no longer sufficient to indicate what namespace the attribute is intended for. Until this is committed, all attributes auto-started by UFS will be placed in the EXTATTR_NAMESPACE_SYSTEM namespace. o The default POSIX.1e attribute names for ACLs and Capabilities have been updated to no longer include the '$' in their filename. As such, if you're using these features, you'll need to rename the attribute backing files to the same names without '$' symbols in front. o Note that these changes will require changes in userland, which will be committed shortly. These include modifications to the extended attribute utilities, as well as to libutil for new namespace string conversion routines. Once the matching userland changes are committed, a buildworld is recommended to update all the necessary include files and verify that the kernel and userland environments are in sync. Note: If you do not use extended attributes (most people won't), upgrading is not imperative although since the system call API has changed, the new userland extended attribute code will no longer compile with old include files. o Couple of minor cleanups while I'm there: make more code compilation conditional on FFS_EXTATTR, which should recover a bit of space on kernels running without EA's, as well as update copyright dates. Obtained from: TrustedBSD Project
|
#
2aa33d2f |
|
06-Mar-2001 |
John Baldwin <jhb@FreeBSD.org> |
Check to see if p_fd is NULL before derferencing it in checkdirs(). It's possible for us to see a process in the early stages of fork before p_fd has been initialized. Ideally, we wouldn't stick a process on the allproc list until it was fully created however.
|
#
fbedc117 |
|
02-Mar-2001 |
Adrian Chadd <adrian@FreeBSD.org> |
Mismatched MFSNAMELEN and MNAMELEN with fstype / fspath. Submitted by: Naoki Kobayashi <shibata@geo.titech.ac.jp>
|
#
f3a90da9 |
|
01-Mar-2001 |
Adrian Chadd <adrian@FreeBSD.org> |
Reviewed by: jlemon An initial tidyup of the mount() syscall and VFS mount code. This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work. * the guts of the mount work has been moved into vfs_mount(). * move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace. * Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long. * rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.) * remove the copyin*() stuff for `path'. `data' still requires copyin*() since its a pointer into userland. * set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to. * NOTE: f_mntonname is intiailised with "/" in the case of a root mount.
|
#
a90ef2ae |
|
28-Feb-2001 |
Ian Dowse <iedowse@FreeBSD.org> |
The kernel did not hold a vnode reference associated with the `rootvnode' pointer, but vfs_syscalls.c's checkdirs() assumed that it did. This bug reliably caused a panic at reboot time if any filesystem had been mounted directly over /. The checkdirs() function is called at mount time to find any process fd_cdir or fd_rdir pointers referencing the covered mountpoint vnode. It transfers these to point at the root of the new filesystem. However, this process was not reversed at unmount time, so processes with a cwd/root at a mount point would unexpectedly lose their cwd/root following a mount-unmount cycle at that mountpoint. This change should fix both of the above issues. Start_init() now holds an extra vnode reference corresponding to `rootvnode', and dounmount() releases this reference when the root filesystem is unmounted just before reboot. Dounmount() now undoes the actions taken by checkdirs() at mount time; any process cdir/rdir pointers that reference the root vnode of the unmounted filesystem are transferred to the now-uncovered vnode. Reviewed by: bde, phk
|
#
91421ba2 |
|
20-Feb-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison). o Allow jail inheritence to be a function of credential inheritence. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use. Notes: o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure. Reviewed by: freebsd-arch Obtained from: TrustedBSD Project
|
#
c3d7bcdf |
|
16-Feb-2001 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Introduce copyinfrom and copyinstrfrom, which can copy data from either user or kernel space. This will allow layering of os-compat (e.g.: linux) system calls. Apply the changes to mount.
|
#
9ed346ba |
|
08-Feb-2001 |
Bosko Milekic <bmilekic@FreeBSD.org> |
Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
|
#
c0c25570 |
|
12-Dec-2000 |
Jake Burkholder <jake@FreeBSD.org> |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.
|
#
7cc0979f |
|
08-Dec-2000 |
David Malone <dwmalone@FreeBSD.org> |
Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>
|
#
553629eb |
|
22-Nov-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Protect the following with a lockmgr lock: allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid Reviewed by: jhb Obtained from: BSD/OS and netbsd
|
#
279d7226 |
|
18-Nov-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>
|
#
1d7e3e42 |
|
02-Nov-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Take VBLK devices further out of their missery. This should fix the panic I introduced in my previous commit on this topic.
|
#
35e0e5b3 |
|
20-Oct-2000 |
John Baldwin <jhb@FreeBSD.org> |
Catch up to moving headers: - machine/ipl.h -> sys/ipl.h - machine/mutex.h -> sys/mutex.h
|
#
a18b1f1d |
|
03-Oct-2000 |
Jason Evans <jasone@FreeBSD.org> |
Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
|
#
98d39ed4 |
|
14-Sep-2000 |
Eivind Eklund <eivind@FreeBSD.org> |
Add function comments for functions missing them
|
#
1d95078a |
|
14-Sep-2000 |
Eivind Eklund <eivind@FreeBSD.org> |
Blow away COMPAT_43 support for mount
|
#
9ff5ce6b |
|
12-Sep-2000 |
Boris Popov <bp@FreeBSD.org> |
Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT. They will be used by nullfs and other stacked filesystems to support full cache coherency. Reviewed in general by: mckusick, dillon
|
#
b4d0de58 |
|
04-Sep-2000 |
Robert Watson <rwatson@FreeBSD.org> |
o Remove commented out code which modified return values from extattr_{get,set} syscalls in the face of partial reads or writes. Obtained from: TrustedBSD Project
|
#
8577117c |
|
01-Sep-2000 |
Don Lewis <truckman@FreeBSD.org> |
access() shouldn't diddle with the contents of a potentially shared credential. Create a temporary copy of the current credential and modify the copy. Submitted by: tegge
|
#
3c2498c0 |
|
08-Aug-2000 |
Tor Egge <tegge@FreeBSD.org> |
Don't set flags on the mount structure before all permission checks have been done. Don't allow multiple mount operations with MNT_UPDATE at the same time on the same mount point. When the first mount operation completed, MNT_UPDATE was cleared in the mount structure, causing the second to complete as if it was a no-update mount operation with the following bad side effects: - mount structure inserted multiple times onto the mountlist - vp->v_mountedhere incorrectly set, causing next namei operation walking into the mountpoint to crash with a locking against myself panic. Plug a vnode leak in case vinvalbuf fails.
|
#
fc3345a4 |
|
28-Jul-2000 |
Robert Watson <rwatson@FreeBSD.org> |
o Modify extattr_{set,get}() syscalls so that partial reads and writes with an error condition such as EINTR, EWOULDBLOCK, and ERESTART, are reported to the application, not silently conceal. This behavior was copied from the {read,write}v() syscalls, and is appropriate there but not here. o Correct a bug in extattr_delete() wherein the LOCKLEAF flag is passed to the wrong argument in namei(), resulting in some unexpected errors during name resolution, and passing in an unlocked vnode. Obtained from: TrustedBSD Project
|
#
3ce7b7aa |
|
26-Jul-2000 |
Robert Watson <rwatson@FreeBSD.org> |
o Lock vnode before calling extattr_* VOP's, and modify vnode spec to allow for that. o Remember to call NDFREE() if exiting as a result of a failed vn_start_write() when snapshotting. Reviewed by: mckusick Obtained from: TrustedBSD Project
|
#
aec3bbe1 |
|
24-Jul-2000 |
Kirk McKusick <mckusick@FreeBSD.org> |
Do not need vrele(nd.ni_vp) as that is done by NDFREE(&nd, 0); Submitted by: Peter Holm <pho@freebsd.org>
|
#
f2a2857b |
|
11-Jul-2000 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
|
#
e6796b67 |
|
03-Jul-2000 |
Kirk McKusick <mckusick@FreeBSD.org> |
Move the truncation code out of vn_open and into the open system call after the acquisition of any advisory locks. This fix corrects a case in which a process tries to open a file with a non-blocking exclusive lock. Even if it fails to get the lock it would still truncate the file even though its open failed. With this change, the truncation is done only after the lock is successfully acquired. Obtained from: BSD/OS
|
#
3275cf73 |
|
03-Jul-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES, that is way cleaner than using the softupdates_stub stunt, which should be killed when convenient. Discussed with: mckusick
|
#
6c66bbed |
|
29-Jun-2000 |
Archie Cobbs <archie@FreeBSD.org> |
Move the securelevel check before loading KLD's into linker_load_file(), instead of requiring every caller of linker_load_file() to perform the check itself. This avoids netgraph loading KLD's when securelevel > 0, not to mention any future code that may call linker_load_file(). Reviewed by: dfr
|
#
7c50d772 |
|
16-Jun-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Revert part of my bioops change which implemented panic(8).
|
#
a2e7a027 |
|
16-Jun-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Virtualizes & untangles the bioops operations vector. Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
|
#
9626b608 |
|
05-May-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
|
#
36e9f877 |
|
28-Mar-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
Commit major SMP cleanups and move the BGL (big giant lock) in the syscall path inward. A system call may select whether it needs the MP lock or not (the default being that it does need it). A great deal of conditional SMP code for various deadended experiments has been removed. 'cil' and 'cml' have been removed entirely, and the locking around the cpl has been removed. The conditional separately-locked fast-interrupt code has been removed, meaning that interrupts must hold the CPL now (but they pretty much had to anyway). Another reason for doing this is that the original separate-lock for interrupts just doesn't apply to the interrupt thread mechanism being contemplated. Modifications to the cpl may now ONLY occur while holding the MP lock. For example, if an otherwise MP safe syscall needs to mess with the cpl, it must hold the MP lock for the duration and must (as usual) save/restore the cpl in a nested fashion. This is precursor work for the real meat coming later: avoiding having to hold the MP lock for common syscalls and I/O's and interrupt threads. It is expected that the spl mechanisms and new interrupt threading mechanisms will be able to run in tandem, allowing a slow piecemeal transition to occur. This patch should result in a moderate performance improvement due to the considerable amount of code that has been removed from the critical path, especially the simplification of the spl*() calls. The real performance gains will come later. Approved by: jkh Reviewed by: current, bde (exception.s) Some work taken from: luoqi's patch
|
#
bd5f5da9 |
|
09-Jan-2000 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add bwillwrite to all system calls that create things in the filesystem. Benchmarks that create huge trees of empty files overwhelm the buffer cache.
|
#
91f37dcb |
|
18-Dec-1999 |
Robert Watson <rwatson@FreeBSD.org> |
Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry. Reviewed by: eivind
|
#
762e6b85 |
|
15-Dec-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Introduce NDFREE (and remove VOP_ABORTOP)
|
#
3854a87e |
|
11-Dec-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Remove accidental pollution unrelated to previous commit. The issue here is real but has not yet been discussed with Eivind.
|
#
4f79d873 |
|
11-Dec-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to madvise(). This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis. MAP_NOSYNC allows one to use mmap() to share memory between processes without incuring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory. Reviewed by: julian, alc, dg
|
#
0429e37a |
|
20-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
struct mountlist and struct mount.mnt_list have no business being a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively. This removes ugly mp != (void*)&mountlist comparisons. Requested by: phk Submitted by: Jake Burkholder jake@checker.org PR: 14967
|
#
91921bd5 |
|
18-Nov-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Ensure that garbage from the kernel stack does not wind up being returned to user mode in the spare fields of the stat structure. PR: kern/14966 Reviewed by: dillon@freebsd.org Submitted by: Kelly Yancey kbyanc@posi.net
|
#
1b727751 |
|
16-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Commit the remaining part of PR14914: Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914
|
#
dd8c04f4 |
|
13-Nov-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Remove WILLRELE from VOP_SYMLINK Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.
|
#
020024f3 |
|
13-Nov-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Fix style bugs from last commit
|
#
edfe736d |
|
11-Nov-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Remove WILLRELE from VOP_RENAME
|
#
5b42dac8 |
|
31-Oct-1999 |
Julian Elischer <julian@FreeBSD.org> |
Most modern OSs have the ability to flag certain mounts as ones to be ignored by default by the df(1) program. This is used mostly to avoid stat()-ing entries that do not represent "real" disk mount points (such as those made by an automounter such as amd.) It is also useful not to have to stat() these entries because it takes longer to report them that for other file systems, being that these mount points are served by a user-level file server and resulting in several context switches. Worse, if the automounter is down unexpectedly, a causal df(1) will hang in an interruptible way. PR: kern/9764 Submitted by: Erez Zadok <ezk@cs.columbia.edu>
|
#
d1f088da |
|
11-Oct-1999 |
Peter Wemm <peter@FreeBSD.org> |
Trim unused options (or #ifdef for undoc options). Submitted by: phk
|
#
3b6fb885 |
|
02-Oct-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Before we start to mess with the VFS name-cache clean things up a little bit: Isolate the namecache in its own file, and give it a dedicated malloc type.
|
#
1b5464ef |
|
29-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove v_maxio from struct vnode. Replace it with mnt_iosize_max in struct mount. Nits from: bde
|
#
2fe5bd8b |
|
25-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a hole in jail(2). Noticed by: Alexander Bezroutchko <abb@zenon.net>
|
#
c24fda81 |
|
10-Sep-1999 |
Alfred Perlstein <alfred@FreeBSD.org> |
Seperate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP. Add fh(open|stat|stafs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD
|
#
c3aac50f |
|
27-Aug-1999 |
Peter Wemm <peter@FreeBSD.org> |
$Id$ -> $FreeBSD$
|
#
dbafb366 |
|
26-Aug-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Simplify the handling of VCHR and VBLK vnodes using the new dev_t: Make the alias list a SLIST. Drop the "fast recycling" optimization of vnodes (including the returning of a prexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore. Rename checkalias2() and checkalias() to addalias() and addaliasu() - which takes dev_t and udev_t arg respectively. Make the revoke syscalls use vcount() instead of VALIASED. Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag. vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one. Print the devicename in specfs/vprint(). Remove a couple of stale LFS vnode flags. Remove unimplemented/unused LK_DRAINED;
|
#
af255dc5 |
|
22-Aug-1999 |
John Polstra <jdp@FreeBSD.org> |
Go back to using microtime() to get the timestamps for {f,l,}utimes(path, NULL) for now. Bruce says I jumped the gun with my change in revision 1.131, or maybe it should use nanotime(), or maybe it shouldn't be decided in the VFS layer at all. I'm leaving it with the old behavior until the Trans-Pacific Internet Vulcan Mind Meld yields fuller understanding.
|
#
4f2a0d4f |
|
21-Aug-1999 |
John Polstra <jdp@FreeBSD.org> |
Use the new vfs_timestamp() function to create the timestamps used by utimes(path, NULL). This gives them the same precision as the timestamps produced by write operations. Do likewise for lutimes() and futimes(). Suggested by bde.
|
#
f4af31cb |
|
12-Aug-1999 |
Alfred Perlstein <alfred@FreeBSD.org> |
Replace a redundant vfs_object_create() call (already done in vn_open) with a KASSERT. Reviewed by: Eivind, Alan Cox
|
#
e32c66c5 |
|
04-Aug-1999 |
Brian Feldman <green@FreeBSD.org> |
Fix fd race conditions (during shared fd table usage.) Badfileops is now used in f_ops in place of NULL, and modifications to the files are more carefully ordered. f_ops should also be set to &badfileops upon "close" of a file. This does not fix other problems mentioned in this PR than the first one. PR: 11629 Reviewed by: peter
|
#
711103c1 |
|
03-Aug-1999 |
Warner Losh <imp@FreeBSD.org> |
o Typo in prior version kept it from compiling (blush). Noticed by: Nobody! o Add comment about why we restrict chflags to root for devices. o nit noticed by bde wrt return values.
|
#
e82ef978 |
|
03-Aug-1999 |
Warner Losh <imp@FreeBSD.org> |
brucify: o use suser_xxx rather than suser to support JAIL code. o KNF comment convention o use vp->type rather than vaddr.type and eliminate call to VOP_GETATTR. Bruce says that vp->type is valid at this point. Submitted by: bde. Not fixed: o return (value) o Comment needs to be longer and more explicit. It will be after the advisory.
|
#
f76f09c1 |
|
02-Aug-1999 |
Warner Losh <imp@FreeBSD.org> |
Only allow root to set file flags on devices.
|
#
ab533dd0 |
|
29-Jul-1999 |
Brian Feldman <green@FreeBSD.org> |
lutimes() bug: FOLLOW should be NOFOLLOW for this one. Submitted by: Dan Nelson <dnelson@emsphone.com>
|
#
67452993 |
|
26-Jul-1999 |
Alan Cox <alc@FreeBSD.org> |
Add sysctl and support code to allow directories to be VMIO'd. The default setting for the sysctl is OFF, which is the historical operation. Submitted by: dillon
|
#
75c13541 |
|
28-Apr-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
This Implements the mumbled about "Jail" feature. This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do. For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers". Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname. Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors. It generally does what one would expect, but setting up a jail still takes a little knowledge. A few notes: I have no scripts for setting up a jail, don't ask me for them. The IP number should be an alias on one of the interfaces. mount a /proc in each jail, it will make ps more useable. /proc/<pid>/status tells the hostname of the prison for jailed processes. Quotas are only sensible if you have a mountpoint per prison. There are no privisions for stopping resource-hogging. Some "#ifdef INET" and similar may be missing (send patches!) If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome! Tools, comments, patches & documentation most welcome. Have fun... Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/
|
#
f711d546 |
|
27-Apr-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Suser() simplification: 1: s/suser/suser_xxx/ 2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>. 3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/ The remaining suser_xxx() calls will be scrutinized and dealt with later. There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce. More changes to the suser() API will come along with the "jail" code.
|
#
cc7532aa |
|
23-Mar-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a sysctl variable which can help stop chroot(2) escapes. kern.chroot_allow_open_directories = 0 chroot(2) fails if there are open directories. kern.chroot_allow_open_directories = 1 (default) chroot(2) fails if there are open directories and the process is subject of a previous chroot(2). kern.chroot_allow_open_directories = anything else filedescriptors are not checked. (old behaviour). I'm very interested in reports about software which breaks when running with the default setting.
|
#
cb11191c |
|
02-Mar-1999 |
Julian Elischer <julian@FreeBSD.org> |
Slight cleanup of code resurected for union mounts.. Submitted by: Tony Finch <dot@dotat.at>
|
#
1871f6cd |
|
27-Feb-1999 |
Julian Elischer <julian@FreeBSD.org> |
Fix code for union mounts Accidentally deleted by peter when he extracted the unionfs stuff in 1.109 Submitted by: Tony Finch <dot@dotat.at>
|
#
a5c9bce7 |
|
25-Feb-1999 |
Bruce Evans <bde@FreeBSD.org> |
Added a used #include (don't depend on "vnode_if.h" including <sys/buf.h>).
|
#
ce02431f |
|
16-Feb-1999 |
Doug Rabson <dfr@FreeBSD.org> |
* Change sysctl from using linker_set to construct its tree using SLISTs. This makes it possible to change the sysctl tree at runtime. * Change KLD to find and register any sysctl nodes contained in the loaded file and to unregister them when the file is unloaded. Reviewed by: Archie Cobbs <archie@whistle.com>, Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)
|
#
4e48a6bf |
|
29-Jan-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use suser() to determine super-user-ness. Collapse some duplicated checks. Reviewed by: bde
|
#
697457a1 |
|
28-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings related to -Wall -Wcast-qual
|
#
d254af07 |
|
27-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
73a6265d |
|
23-Jan-1999 |
Bruce Evans <bde@FreeBSD.org> |
Go back to only supporting revoke() for bdevs and cdevs. It is very buggy for fifos, and no one seems to have investigated its behaviour on other types of files. It has been broken since the Lite2 merge in rev.1.54. Nagged about by: Brian Feldman (green@unixhelp.org)
|
#
fb116777 |
|
05-Jan-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Remove the 'waslocked' parameter to vfs_object_create().
|
#
4c016975 |
|
12-Dec-1998 |
Matthew Dillon <dillon@FreeBSD.org> |
PR: kern/8965 Obtained from: Stephen Clawson <sclawson@cs.utah.edu> Wakeup anyone waiting on a mount point prior to returning from umount, whether an error occurs or not. Fixes a stat/NFS-umount race and other potential future problems. Fix taken from bug/pr which also indicated that the same fix has already been applied to OpenBSD and NetBSD.
|
#
02fc72db |
|
03-Nov-1998 |
Peter Wemm <peter@FreeBSD.org> |
make mount(2) automatically kldload modules if the requested filesystem isn't present.
|
#
8c14bf40 |
|
03-Nov-1998 |
Peter Wemm <peter@FreeBSD.org> |
Change the #ifdef UNION code into a callable hook. Arrange to have this set up when unionfs is present, either statically or as a kld module.
|
#
b421db37 |
|
31-Oct-1998 |
Peter Wemm <peter@FreeBSD.org> |
The last argument to vm_object_page_clean() are now bit flags, rather than the old true/false. While here, have vfs_msync() only call vm_object_page_clean() with OBJPC_SYNC if called with MNT_WAIT flags. vfs_msync() is called at unmount time (with MNT_WAIT) and from the syncer process (formerly update). This should make dirty mmap writebacks a little less nasty. I have tested this a little with SOFTUPDATES enabled, but I don't normally use it since I've been badly burned too many times.
|
#
e266594c |
|
24-Sep-1998 |
Luoqi Chen <luoqi@FreeBSD.org> |
Eliminate a race in VOP_FSYNC() when softupdates is enabled. Submitted by: Kirk McKusick <mckusick@McKusick.COM> Two minor changes are also included, 1. Remove gratuitious checks for error return from vn_lock with LK_RETRY set, vn_lock should always succeed in these cases. 2. Back out change rev. 1.36->1.37, which unnecessarily makes async mount a little more unstable. It also keeps us in sync with other BSDs. Suggested by: Bruce Evans <bde@zeta.org.au>
|
#
a58915fc |
|
09-Sep-1998 |
Tor Egge <tegge@FreeBSD.org> |
Don't keep the underlying directory locked while performing the file system specific VFS_MOUNT operation. PR: 1067
|
#
a23d65bf |
|
14-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this changes is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.
|
#
e25169f2 |
|
02-Jul-1998 |
David Greenman <dg@FreeBSD.org> |
Reset MNT_ASYNC flag if needed if unmount() should fail. Submitted by: Paul Saab <paul@mu.org>
|
#
0d3dd8fb |
|
08-Jun-1998 |
John Dyson <dyson@FreeBSD.org> |
Remove some junk left over from a previous commit. Submitted by: phk
|
#
ecbb00a2 |
|
07-Jun-1998 |
Doug Rabson <dfr@FreeBSD.org> |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
#
1f562172 |
|
10-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Fix the futimes/undelete/utrace conflict with other BSD's. Note that the only common usage of utrace (the possible problem with this commit) is with malloc, so this should be a real problem. Add the various NetBSD syscalls that allow full emulation of their development environment.
|
#
7be2d300 |
|
06-May-1998 |
Mike Smith <msmith@FreeBSD.org> |
In the words of the submitter: --------- Make callers of namei() responsible for releasing references or locks instead of having the underlying filesystems do it. This eliminates redundancy in all terminal filesystems and makes it possible for stacked transport layers such as umapfs or nullfs to operate correctly. Quality testing was done with testvn, and lat_fs from the lmbench suite. Some NFS client testing courtesy of Patrik Kudo. vop_mknod and vop_symlink still release the returned vpp. vop_rename still releases 4 vnode arguments before it returns. These remaining cases will be corrected in the next set of patches. --------- Submitted by: Michael Hancock <michaelh@cet.co.jp>
|
#
59bad7c5 |
|
19-Apr-1998 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Backed out lseek changes.
|
#
25096724 |
|
18-Apr-1998 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Return EINVAL and do not change file pointer if resulting offset is negative. PR: kern/6184
|
#
5ddc8ded |
|
08-Apr-1998 |
Wolfram Schneider <wosch@FreeBSD.org> |
New mount option nosymfollow. If enabled, the kernel lookup() function will not follow symbolic links on the mounted file system and return EACCES (Permission denied).
|
#
006b9b7d |
|
29-Mar-1998 |
John Dyson <dyson@FreeBSD.org> |
Correct a significant problem with the softupdates port. Allow fsync to work properly within the softupdates framework, and thereby eliminate some unfortunate panics.
|
#
b1897c19 |
|
08-Mar-1998 |
Julian Elischer <julian@FreeBSD.org> |
Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman) Submitted by: Kirk McKusick (mcKusick@mckusick.com) Obtained from: WHistle development tree
|
#
8f9110f6 |
|
07-Mar-1998 |
John Dyson <dyson@FreeBSD.org> |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifest themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistancy, and as a side-effect broke the vfs.ioopt code. This code might have been committed seperately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitious buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups assocated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistancy problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify it's status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and it's descendents to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
#
9f24f214 |
|
14-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
Make the rootdir handling more consistent. Now, processes always have a root vnode associated with them, and no special checks for the null case are needed. Submitted by: terry@freebsd.org
|
#
3217023e |
|
07-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
Fix a problem with vn_lock in fsync.
|
#
0b08f5f7 |
|
05-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Back out DIAGNOSTIC changes.
|
#
47cfdb16 |
|
04-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Turn DIAGNOSTIC into a new-style option.
|
#
95e5e988 |
|
05-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.
|
#
2be70f79 |
|
28-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Lots of improvements, including restructring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.
|
#
675ea6f0 |
|
26-Dec-1997 |
Bruce Evans <bde@FreeBSD.org> |
Unspammed nested include of <vm/vm_zone.h>.
|
#
5591b823d |
|
16-Dec-1997 |
Eivind Eklund <eivind@FreeBSD.org> |
Make COMPAT_43 and COMPAT_SUNOS new-style options.
|
#
52aef196 |
|
02-Dec-1997 |
Bruce Evans <bde@FreeBSD.org> |
Cleaned up __getcwd(). This should be cosmetic except disabled calls are now counted. Reviewed by: phk
|
#
865737f4 |
|
21-Nov-1997 |
Bruce Evans <bde@FreeBSD.org> |
Staticized. Use OID_AUTO instead of a magic number for the debug.syncprt sysctl. (This sysctl doesn't actually work. FreeBSD nuked it, but parts of it were mismerged from Lite2. It is not very good, but better than nothing.)
|
#
d02601f8 |
|
21-Nov-1997 |
Bruce Evans <bde@FreeBSD.org> |
Fixed rev.1.81. mp->mnt_kern_flag was restored in the non-error case of `mount -u'. This only matters for `mount -u' competing with unmounts. If I understand the locking correctly: if mount() blocks, then unmount() may run and set mp->kern_flag for the same mp. Then unmount() blocks waiting for mount() to finish. When unmount() continues, its MNTK flags (MNTK_UNMOUNT and MNTK_MWAIT) may have been clobbered. Didn't fix old bugs: - restoring mp->mnt_kern_flag is wrong for the same reasons in the error case. - the error case of unmount() seems to be broken too: (a) MNTK_UNMOUNT gets clobbered, although another unmount() may have set it. Perhaps it shouldn't be set until after the full lock is aquired. (b) MNTK_MWAIT isn't honoured. Fixed a nearby style bug.
|
#
52bf64c7 |
|
12-Nov-1997 |
Julian Elischer <julian@FreeBSD.org> |
Reviewed by: hackers@freebsd.org in general Obtained from: Whistle Communications tree Add an option to the way UFS works dependent on the SUID bit of directories This changes makes things a whole lot simpler on systems running as fileservers for PCs and MACS. to enable the new code you must 1/ enable option SUIDDIR on the kernel. 2/ mount the filesystem with option suiddir. hopefully this makes it difficult enough for people to do this accidentally. see the new chmod(2) man page for detailed info.
|
#
b1f4a44b |
|
11-Nov-1997 |
Julian Elischer <julian@FreeBSD.org> |
Reviewed by: various. Ever since I first say the way the mount flags were used I've hated the fact that modes, and events, internal and exported, and short-term and long term flags are all thrown together. Finally it's annoyed me enough.. This patch to the entire FreeBSD tree adds a second mount flag word to the mount struct. it is not exported to userspace. I have moved some of the non exported flags over to this word. this means that we now have 8 free bits in the mount flags. There are another two that might well move over, but which I'm not sure about. The only user visible change would have been in pstat -v, except that davidg has disabled it anyhow. I'd still like to move the state flags and the 'command' flags apart from each other.. e.g. MNT_FORCE really doesn't have the same semantics as MNT_RDONLY, but that's left for another day.
|
#
cb226aaa |
|
06-Nov-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warning, and removes a lot of cruft from the sources. I have not removed the /*ARGSUSED*/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need recompiled.
|
#
1315bcf7 |
|
28-Oct-1997 |
Bruce Evans <bde@FreeBSD.org> |
Fixed style bugs in open() fix.
|
#
1c1ff294 |
|
23-Oct-1997 |
KATO Takenori <kato@FreeBSD.org> |
Disallow non-root mount. If you want to allow non-root mount, change vfs.usermount into 1 with sysctl.
|
#
2094bd73 |
|
22-Oct-1997 |
Joerg Wunsch <joerg@FreeBSD.org> |
Reject attempts to call open() with an illegal combination of O_RDONLY, O_WRONLY, O_RDWR.
|
#
a1c995b6 |
|
12-Oct-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Last major round (Unless Bruce thinks of somthing :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde
|
#
ad324c88 |
|
28-Sep-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix handling of nested mountpoints in __getcwd() Detected by: Simon Shapiro <Shimon@i-Connect.Net>
|
#
81bca6dd |
|
27-Sep-1997 |
KATO Takenori <kato@FreeBSD.org> |
Clustered read and write are switched at mount-option level. 1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW bits of the mnt_flag. The sysctl variables, vfs.foo.doclusterread and vfs.foo.doclusterwrite are deleted. Only mount option can control clustered I/O from userland. 2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR and D_CLUSTERW bits of the d_flags member in the block device switch table. If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR / MNT_NOCLUSTERW bits will be set. In this case, MNT_NOCLUSTERR and MNT_NOCLUSTERW cannot be cleared from userland. 3. Vnode driver disables both clustered read and write. 4. Union filesystem disables clutered write. Reviewed by: bde
|
#
00544193 |
|
24-Sep-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
A couple of handles to tweak, more statistics.
|
#
99448ed1 |
|
20-Sep-1997 |
John Dyson <dyson@FreeBSD.org> |
Change the M_NAMEI allocations to use the zone allocator. This change plus the previous changes to use the zone allocator decrease the useage of malloc by half. The Zone allocator will be upgradeable to be able to use per CPU-pools, and has more intelligent usage of SPLs. Additionally, it has reasonable stats gathering capabilities, while making most calls inline.
|
#
044839fb |
|
16-Sep-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't leak memory, from sef. Stylistic nits and a blunder, from bde.
|
#
7874d7a3 |
|
15-Sep-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Solve race-condition, return path in normal order. A couple of stylistic nits from Bruce. If your libc contains version 1.11 or 1.12 of getcwd.c, (ie: if you recompiled libc one of the last couple of days): >>> Recompile LIBC before you boot a new kernel <<< A new libc will deal with both old and new kernels.
|
#
d56f6402 |
|
15-Sep-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Deal more correctly with mountpoints.
|
#
7822f1c6 |
|
14-Sep-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a __getcwd() syscall. This is intentionally undocumented, but all it does is to try to figure the pwd out from the vfs namecache, and return a reversed string to it. libc:getcwd() is responsible for flipping it back.
|
#
e4ba6a82 |
|
02-Sep-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused #includes.
|
#
f6b4c285 |
|
17-Jul-1997 |
Doug Rabson <dfr@FreeBSD.org> |
Merge WebNFS support from NetBSD Obtained from: NetBSD
|
#
42146e37 |
|
04-Apr-1997 |
Doug Rabson <dfr@FreeBSD.org> |
[Previous comment was incorrect for these files] Added calls to VFS lock debugging macros to make fixing filesystems' locking easier.
|
#
de15ef6a |
|
04-Apr-1997 |
Doug Rabson <dfr@FreeBSD.org> |
Add a function vop_sharedlock which a copy of vop_nolock without the implementation #ifdef out. This can be used for now by NFS. As soon as all the other filesystems' locking is fixed, this can go away. Print the vnode address in vprint for easier debugging.
|
#
57862eed |
|
30-Mar-1997 |
Peter Wemm <peter@FreeBSD.org> |
Code to do lchown(2), copied from chown(2) except it's NOFOLLOW in ND_INIT instead of FOLLOW.
|
#
6c14d95d |
|
30-Mar-1997 |
Peter Wemm <peter@FreeBSD.org> |
Treat symlinks as first class citizens with their own uid/gid rather than as shadows of their containing directory. This should solve the problem of users not being able to delete their symlinks from /tmp once and for all. Symlinks do not have modes though, they are accessable to everything that can read the directory (as before). They are made to show this fact at lstat time (they appear as mode 0777 always, since that's how the the lookup routines in the kernel treat them). More commits will follow, eg: add a real lchown() syscall and man pages.
|
#
8f89943e |
|
23-Mar-1997 |
Guido van Rooij <guido@FreeBSD.org> |
Add generation number randomization. Newly created filesystems wil now automatically have random generation numbers. The kenel way of handling those also changed. Further it is advised to run fsirand on all your nfs exported filesystems. the code is mostly copied from OpenBSD, with the randomization chanegd to use /dev/urandom Reviewed by: Garrett Obtained from: OpenBSD
|
#
3ac4d1ef |
|
22-Mar-1997 |
Bruce Evans <bde@FreeBSD.org> |
Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined. Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.
|
#
3a558f83 |
|
04-Mar-1997 |
Mike Smith <msmith@FreeBSD.org> |
Check that vp->v_mount is non-null in fsync() before dereferencing it to obtain the mountpoint's MNT_ASYNC flag. This is a Very Definite Last-Minute 2.2 Bugfix Candidate. Reviewed by: sef
|
#
6875d254 |
|
22-Feb-1997 |
Peter Wemm <peter@FreeBSD.org> |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
61f84e5b |
|
12-Feb-1997 |
Mike Pritchard <mpp@FreeBSD.org> |
Don't depend on FIFO being defined to enable mkfifo. It is now always compiled. Submitted by: bde
|
#
72a5ee14 |
|
12-Feb-1997 |
Mike Pritchard <mpp@FreeBSD.org> |
Add function protypes for the new Lite2 unionfs functions.
|
#
820d8cf4 |
|
11-Feb-1997 |
Mike Pritchard <mpp@FreeBSD.org> |
Comment out a call to the #ifdef DIAGNOSTIC routine vfs_bufstats(). This routine was not imported in the Lite2 merge.
|
#
996c772f |
|
09-Feb-1997 |
John Dyson <dyson@FreeBSD.org> |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
1130b656 |
|
14-Jan-1997 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
bb65f5a1 |
|
19-Dec-1996 |
Bruce Evans <bde@FreeBSD.org> |
Fixed lseek() on named pipes. It always succeeded but should always fail. Broke locking on named pipes in the same way as locking on non-vnodes (wrong errno). This will be fixed later. The fix involves negative logic. Named pipes are now distinguished from other types of files with vnodes, and there is additional code to handle vnodes and named pipes in the same way only where that makes sense (not for lseek, locking or TIOCSCTTY).
|
#
030e2e9e |
|
19-Sep-1996 |
Nate Williams <nate@FreeBSD.org> |
In sys/time.h, struct timespec is defined as: /* * Structure defined by POSIX.4 to be like a timeval. */ struct timespec { time_t ts_sec; /* seconds */ long ts_nsec; /* and nanoseconds */ }; The correct names of the fields are tv_sec and tv_nsec. Reminded by: James Drobina <jdrobina@infinet.com>
|
#
b71fec07 |
|
03-Sep-1996 |
Bruce Evans <bde@FreeBSD.org> |
Eliminated nested include of <sys/unistd.h> in <sys/file.h> in the kernel. Include it directly in the few places where it is used. Reduced some #includes of <sys/file.h> to #includes of <sys/fcntl.h> or nothing.
|
#
9e043042 |
|
03-Sep-1996 |
David Greenman <dg@FreeBSD.org> |
Implemented kernel side of MNT_NOATIME mount option. This option disables the file access time update on reads and can be useful in reducing filesystem overhead in cases where the access time is not important (like Usenet news spools).
|
#
472fe5e4 |
|
24-May-1996 |
Peter Wemm <peter@FreeBSD.org> |
Dont allow directories to be link()ed or unlink()ed, even for root (returns EPERM always, the errno is specified by POSIX). If you really have a desperate need to link or unlink a directory, you can use fsdb. :-) This should stop any chance of ftpd, rdist, "rm -rf", etc from bugging out and damaging the filesystem structure or loosing races with malicious users. Reviewed by: davidg, bde
|
#
d03b4017 |
|
10-May-1996 |
Bruce Evans <bde@FreeBSD.org> |
Hide options for emulators and static file systems in opt_dontuse.h. These options only apply at config time. Using them at compile time would break the corresponding lkms.
|
#
edbfedac |
|
11-Mar-1996 |
Peter Wemm <peter@FreeBSD.org> |
Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all files are off the vendor branch, so this should not change anything. A "U" marker generally means that the file was not changed in between the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally means that there was a change. [note new unused (in this form) syscalls.conf, to be 'cvs rm'ed]
|
#
c128d215 |
|
16-Jan-1996 |
David Greenman <dg@FreeBSD.org> |
Make sure the mountpoint is marked busy before doing operations on it. This fixes a panic that freefall suffered last night. Obtained partially from 4.4-lite2, but minus the new bug that it introduced
|
#
692910e6 |
|
05-Jan-1996 |
Garrett Wollman <wollman@FreeBSD.org> |
convert FDESC, KERNFS, NULLFS, PORTAL, UMAPFS, and UNION to the new style of options.
|
#
27a0b398 |
|
17-Dec-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Staticize. Unstaticize a function in scsi/scsi_base that was used, with an undocumented option. My last count on the LINT kernel shows: Total symbols: 3647 unref symbols: 463 undef symbols: 4 1 ref symbols: 1751 2 ref symbols: 485 Approaching the pain threshold now.
|
#
a316d390 |
|
10-Dec-1995 |
John Dyson <dyson@FreeBSD.org> |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
|
#
efeaf95a |
|
06-Dec-1995 |
David Greenman <dg@FreeBSD.org> |
Untangled the vm.h include file spaghetti.
|
#
75a85811 |
|
18-Nov-1995 |
Bruce Evans <bde@FreeBSD.org> |
Fixed the errno returned by rename("dir1", "dir2/."). It was EISDIR (duh); translate it to EINVAL which is the errno for other renames to ".".
|
#
395e6735 |
|
14-Nov-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Change some of the debug sysctl vars. The semantics of these will change.
|
#
c8cbd868 |
|
13-Nov-1995 |
Bruce Evans <bde@FreeBSD.org> |
Fixed a cast in olseek(). Fixed confusing order of declarations of getvnode()'s args.
|
#
d2d3e875 |
|
11-Nov-1995 |
Bruce Evans <bde@FreeBSD.org> |
Included <sys/sysproto.h> to get central declarations for syscall args structs and prototypes for syscalls. Ifdefed duplicated decentralized declarations of args structs. It's convenient to have this visible but they are hard to maintain. Some are already different from the central declarations. 4.4lite2 puts them in comments in the function headers but I wanted to avoid the large changes for that.
|
#
4dc4e0d2 |
|
05-Nov-1995 |
John Dyson <dyson@FreeBSD.org> |
Make MNT_ASYNC more effective for UFS. It should not be too much more dangerous than the original MNT_ASYNC. There might be some minor security considerations due to data writes not being posted as promptly as before. Meta-data operations are still not quite as fast as Linux, but streaming I/O is still higher.
|
#
046bc053 |
|
04-Nov-1995 |
Bruce Evans <bde@FreeBSD.org> |
Prototype getvnode() in the right place (where ibcs2_stat.c can see it).
|
#
d68a4190 |
|
22-Oct-1995 |
David Greenman <dg@FreeBSD.org> |
Moved the filesystem read-only check out of the syscalls and into the filesystem layer, as was done in lite-2. Merged in some other cosmetic changes while I was at it. Rewrote most of msdosfs_access() to be more like ufs_access() and to include the FS read-only check. Obtained from: partially from 4.4BSD-lite2
|
#
ad7507e2 |
|
07-Oct-1995 |
Steven Wallace <swallace@FreeBSD.org> |
Remove prototype definitions from <sys/systm.h>. Prototypes are located in <sys/sysproto.h>. Add appropriate #include <sys/sysproto.h> to files that needed protos from systm.h. Add structure definitions to appropriate files that relied on sys/systm.h, right before system call definition, as in the rest of the kernel source. In kern_prot.c, instead of using the dummy structure "args", create individual dummy structures named <syscall>_args. This makes life easier for prototype generation.
|
#
2b14f991 |
|
28-Aug-1995 |
Julian Elischer <julian@FreeBSD.org> |
Reviewed by: julian with quick glances by bruce and others Submitted by: terry (terry lambert) This is a composite of 3 patch sets submitted by terry. they are: New low-level init code that supports loadbal modules better some cleanups in the namei code to help terry in 16-bit character support some changes to the mount-root code to make it a little more modular.. NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able to test those cases.. certainly mounting root of disk still works just fine.. mfs should work but is untested. (tomorrows task) The low level init stuff includes a total rewrite of init_main.c to make it possible for new modules to have an init phase by simply adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can be added to the kernel without editing any other files other than the 'files' file.
|
#
cf2455a3 |
|
17-Aug-1995 |
Bruce Evans <bde@FreeBSD.org> |
The `cred' and `proc' args were missing for some VOP_OPEN() and VOP_CLOSE() calls. Found by: gcc -Wstrict-prototypes after I supplied some of the 5000+ missing prototypes. Now I have 9000+ lines of warnings and errors about bogus conversions of function pointers.
|
#
628641f8 |
|
11-Aug-1995 |
David Greenman <dg@FreeBSD.org> |
Converted mountlist to a CIRCLEQ. Partially obtained from: 4.4BSD-Lite2
|
#
47777413 |
|
01-Aug-1995 |
David Greenman <dg@FreeBSD.org> |
Removed my special-case hack for VOP_LINK and fixed the problem with the wrong vp's ops vector being used by changing the VOP_LINK's argument order. The special-case hack doesn't go far enough and breaks the generic bypass routine used in some non-leaf filesystems. Pointed out by Kirk McKusick.
|
#
70eec742 |
|
30-Jul-1995 |
Bruce Evans <bde@FreeBSD.org> |
Ignore trailing slashes in pathnames that "refer to a directory", as is required to be POSIXLY_CORRECT and "right". I interpret "referring to a directory" as being a directory or becoming a directory. E.g., the trailing slashes in mkdir("/nonesuch/"), rename("/tmp", /nonesuch/") and link("/tmp", "/root_can_like_dirs/") are ignored because the target will become a directory if the syscall succeeds. A trailing slash on a symlink causes the symlink to be followed (this is a bug if the symlink doesn't point to a directory; fix later).
|
#
24a1cce3 |
|
13-Jul-1995 |
David Greenman <dg@FreeBSD.org> |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!! Much needed overhaul of the VM system. Included in this first round of changes: 1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers". 2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has escentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items. 3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed. 4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug. 5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance. 6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain. 7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance. 8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed. 9) Some almost useless debugging code removed. 10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure escentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology. 11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended. 12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course). 13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non- standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE. 14) (dyson) Sharing maps have been removed. It's marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintain- ability. (As were most all of these changes) TODO: 1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size. 2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness. 3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind. 4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems. 5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
aa2cabb9 |
|
27-Jun-1995 |
David Greenman <dg@FreeBSD.org> |
1) Converted v_vmdata to v_object. 2) Removed unnecessary vm_object_lookup()/pager_cache(object, TRUE) pairs after vnode_pager_alloc() calls - the object is already guaranteed to be persistent. 3) Removed some gratuitous casts.
|
#
98796526 |
|
28-Jun-1995 |
David Greenman <dg@FreeBSD.org> |
Fixed VOP_LINK argument order botch.
|
#
61f5d510 |
|
21-May-1995 |
David Greenman <dg@FreeBSD.org> |
Changes to fix the following bugs: 1) Files weren't properly synced on filesystems other than UFS. In some cases, this lead to lost data. Most likely would be noticed on NFS. The fix is to make the VM page sync/object_clean general rather than in each filesystem. 2) Mixing regular and mmaped file I/O on NFS was very broken. It caused chunks of files to end up as zeroes rather than the intended contents. The fix was to fix several race conditions and to kludge up the "b_dirtyoff" and "b_dirtyend" that NFS relies upon - paying attention to page modifications that occurred via the mmapping. Reviewed by: David Greenman Submitted by: John Dyson
|
#
1469eec8 |
|
15-May-1995 |
David Greenman <dg@FreeBSD.org> |
Fixed incompleteness that would allow dirty filesystems to get mounted when the single user shell was terminated. These changes disallow mounting or R/W upgrading filesystems that are dirty unless "-f" (force) option is used with mount. /etc/rc has been modified to abort the startup if one or more non-nfs partitions fail to mount. Reviewed by: Poul-Henning Kamp, Rod Grimes
|
#
c9ae46b1 |
|
02-May-1995 |
David Greenman <dg@FreeBSD.org> |
Removed unused variable caused by last commit.
|
#
beef0195 |
|
02-May-1995 |
David Greenman <dg@FreeBSD.org> |
Fix for sync() to close a potential panic with accessing a mount struct that had been freed. Submitted by: John Dyson
|
#
2547597b |
|
29-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Added a set of braces to make the compiler happy.
|
#
ab828ab8 |
|
19-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Moved call to vnode_pager_uncache in rename() to before the VOP_RENAME. It was previously after the VOP_RENAME and the reference and lock on the vnode had already been lost, allowing interesting internel inconsistencies. This is one of the two reasons why freefall was crashing every hour or two (the other being nullfs bugs). Don't call vnode_pager_uncache in revoke(). revoke() is only allowed on VCHR and VBLK vnodes.
|
#
b5e8ce9f |
|
16-Mar-1995 |
Bruce Evans <bde@FreeBSD.org> |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
519b3d1a |
|
27-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Do a vnode_pager_uncache after the VOP_RENAME to lose the remaining reference to the old vnode. Suggested by: Bruce Evans
|
#
2655f626 |
|
13-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
In sync(), don't dereference the proc pointer if it's NULL. Should fix most or all of the problems with calling sync() without a curproc (which can happen in machdep.c during a panic sync).
|
#
c4a7b7e1 |
|
04-Nov-1994 |
David Greenman <dg@FreeBSD.org> |
From tim@cs.city.ac.uk (Tim Wilkinson): Find enclosed a short bugfix to get the union filesystem up and running in FreeBSD-current. We don't think we've got all the problems yet but these fixes sort out the major ones (which mostly concert bad locking of vnodes), no doubt we'll post others as necessary. Known problems include the inability of the umount command (not the system call) to unmount unions in certain circumstances (this is due the way "realpath" works), and the failure of direntries to always get all available files in unioned subdirectories. We are, as they say, working on it. Submitted by: tim@cs.city.ac.uk (Tim Wilkinson)
|
#
091b0456 |
|
20-Oct-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Make my ALLDEVS kernel compile (basically, LINT minus a lot of options). This involves fixing a few things I broke last time.
|
#
17b9f9f4 |
|
14-Oct-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix the problem with panics when mounting on nonexistant directories. Probably my fault in the first place...
|
#
99ec0d5b |
|
11-Oct-1994 |
Søren Schmidt <sos@FreeBSD.org> |
Removed static declaration of getvnode() (used in ibcs2)
|
#
dcd01eb3 |
|
08-Oct-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Cosmetics: added ()'s and fixed prinf-formats to make gcc silent.
|
#
8e58bf68 |
|
05-Oct-1994 |
David Greenman <dg@FreeBSD.org> |
Stuff object into v_vmdata rather than pager. Not important which at the moment, but will be in the future. Other changes mostly cosmetic, but are made for future VMIO considerations. Submitted by: John Dyson
|
#
797f2d22 |
|
02-Oct-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
All of this is cosmetic. prototypes, #includes, printfs and so on. Makes GCC a lot more silent.
|
#
9abf4d6e |
|
28-Sep-1994 |
Doug Rabson <dfr@FreeBSD.org> |
Make NFS ask the filesystems for directory cookies instead of making them itself.
|
#
c9b1d604 |
|
22-Sep-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
More loadable VFS changes: - Make a number of filesystems work again when they are statically compiled (blush) - FIFOs are no longer optional; ``options FIFO'' removed from distributed config files.
|
#
c901836c |
|
20-Sep-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Implemented loadable VFS modules, and made most existing filesystems loadable. (NFS is a notable exception.)
|
#
8fceb1ba |
|
02-Sep-1994 |
David Greenman <dg@FreeBSD.org> |
Disallow truncating to negative file sizes. Doing so causes ffs_truncate() and perhaps other fs truncate's to go crazy and panic the machine or worse. This fixes the truncate bug reported by Michael Class.
|
#
2fc62994 |
|
01-Sep-1994 |
David Greenman <dg@FreeBSD.org> |
Make olstat() consistent with lstat() - so they both return the same owner.. Submitted by: Kirk McKusick
|
#
e0e9c421 |
|
20-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Implemented filesystem clean bit via: machdep.c: Changed printf's a little and call vfs_unmountall() if the sync was successful. cd9660_vfsops.c, ffs_vfsops.c, nfs_vfsops.c, lfs_vfsops.c: Allow dismount of root FS. It is now disallowed at a higher level. vfs_conf.c: Removed unused rootfs global. vfs_subr.c: Added new routines vfs_unmountall and vfs_unmountroot. Filesystems are now dismounted if the machine is properly rebooted. ffs_vfsops.c: Toggle clean bit at the appropriate places. Print warning if an unclean FS is mounted. ffs_vfsops.c, lfs_vfsops.c: Fix bug in selecting proper flags for VOP_CLOSE(). vfs_syscalls.c: Disallow dismounting root FS via umount syscall.
|
#
3c4dd356 |
|
02-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Added $Id$
|
#
26f9a767 |
|
25-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
df8bae1d |
|
24-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
BSD 4.4 Lite Kernel Sources
|