#
6b0cf2a2 |
|
24-Apr-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
vfs_lookup.c: only call ktrcapfail() if KTRACE is enabled Reviewed by: emaste, imp, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D44931
|
#
66df8102 |
|
24-Apr-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
sys/namei.h: move NI_CAP_VIOLATION() macro from namei.h to vfs_lookup.c Reviewed by: emaste, imp, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D44931
|
#
0cd9cde7 |
|
06-Apr-2024 |
Jake Freeland <jfree@FreeBSD.org> |
ktrace: Record namei violations with KTR_CAPFAIL Report namei path lookups while Capsicum violation tracing with CAPFAIL_NAMEI. vfs caching is also ignored when tracing to mimic capability mode behavior. Reviewed by: markj Approved by: markj (mentor) MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D40680
|
#
55edc40e |
|
04-Jan-2024 |
Mark Johnston <markj@FreeBSD.org> |
file: Remove the fd parameter to fgetvp_lookup() and fgetvp_lookup_smr() The fd is always obtained from nameidata, so just fetch it from there instead. No functional change intended. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D43257
|
#
29363fb4 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree, automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
|
#
586fed0b |
|
04-Nov-2023 |
Jason A. Harmening <jah@FreeBSD.org> |
vfs_lookup_cross_mount(): restore previous do...while loop When the cross-mount walking logic in vfs_lookup() was factored into a separate function, the main cross-mount traversal loop was changed from a do...while loop conditional on the current vnode having VIRF_MOUNTPOINT set to an unconditional for(;;) loop. For the unionfs 'crosslock' case in which the vnode may be re-locked, this meant that continuing the loop upon finding inconsistent v_mountedhere state would no longer branch to a check that the vnode is in fact still a mountpoint. This would in turn lead to over-iteration and, for INVARIANTS builds, a failed assert on the next iteration. Fix this by restoring the previous loop behavior. Reported by: pho Tested by: pho Fixes: 80bd5ef0702562c546fa1717e8fe221058974eac MFC after: 1 week
|
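The loop-shape issue described in 586fed0b can be illustrated with a toy model. This is a minimal sketch, not the kernel code: `struct toy_vnode`, `walk_do_while()`, `walk_unconditional()` and the `cap` parameter are all hypothetical names invented for the illustration. It relies only on the C rule that `continue` inside a do...while jumps to the loop condition, whereas inside for(;;) it jumps straight back to the top of the body.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; not the kernel's struct vnode. */
struct toy_vnode {
	bool is_mountpoint;	/* models VIRF_MOUNTPOINT being set */
	bool mount_visible;	/* models a consistent v_mountedhere */
};

/* do...while form: `continue` re-evaluates the mountpoint condition. */
static int
walk_do_while(struct toy_vnode *vp)
{
	int iterations = 0;

	do {
		iterations++;
		if (!vp->mount_visible) {
			/* Inconsistent state: retry via the loop condition. */
			continue;
		}
		/* Pretend we crossed into a root that is not a mountpoint. */
		vp->is_mountpoint = false;
	} while (vp->is_mountpoint);
	return (iterations);
}

/*
 * for(;;) form: `continue` skips any "still a mountpoint?" check.
 * `cap` exists only so that the toy terminates.
 */
static int
walk_unconditional(struct toy_vnode *vp, int cap)
{
	int iterations = 0;

	for (;;) {
		if (iterations == cap)
			break;
		iterations++;
		if (!vp->mount_visible)
			continue;
		vp->is_mountpoint = false;
		break;
	}
	return (iterations);
}
```

With a vnode that stopped being a mountpoint while its lock was dropped (both fields false), the do...while variant leaves after one pass, while the unconditional variant keeps iterating until the artificial cap is reached.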
#
02cbc029 |
|
22-Sep-2023 |
Olivier Certner <olce.freebsd@certner.fr> |
vfs: fix reference counting/locking on LK_UPGRADE error Factoring out this code unfortunately introduced reference and lock leaks in case of failure in the lock upgrade path under VV_CROSSLOCK. In terms of practical use, this impacts unionfs (and nullfs in a corner case). Fixes: 80bd5ef07025 ("vfs: factor out mount point traversal to a dedicated routine") MFC after: 3 days MFC to: stable/14 releng/14.0 Sponsored by: The FreeBSD Foundation Reviewed by: mjg [mjg: massaged the commit message a little bit] Differential Revision: https://reviews.freebsd.org/D41731
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
b8b33f3b |
|
09-Aug-2023 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: retire NAMEI_DIAGNOSTIC It is too spammy and information-deficient for practical use. Also see https://reviews.freebsd.org/D41207
|
#
80bd5ef0 |
|
05-Jul-2023 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: factor out mount point traversal to a dedicated routine While here tidy up asserts in the area. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40883
|
#
ebf37c3f |
|
05-Jul-2023 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop LK_RETRY when crossing mount points in vfs_lookup vn_lock already returns the expected error. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40883
|
#
0724cf38 |
|
05-Jul-2023 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: whack dpunlocked var in vfs_lookup It is redundant given the bad_unlocked goto label.
|
#
5842f73d |
|
05-Jul-2023 |
Olivier Certner <olce.freebsd@certner.fr> |
vfs: compute_lk_cnflags(): Remove unused argument 'cnflags'; Rename Argument unused since commit 93a0ba8f4990785f. Rename it to enforce_lkflags(), which seems to more aptly describe what it does. [mjg: massaged the commit message a little] Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D40848
|
#
7b5a1c39 |
|
27-Jun-2023 |
Igor Ostapenko <pm@igoro.pro> |
vfs: bring vfs_lookup() description comment up to date Signed-off-by: Igor Ostapenko <pm@igoro.pro> Reviewed by: imp, mhorne Pull Request: https://github.com/freebsd/freebsd-src/pull/737
|
#
5958cd88 |
|
27-Jun-2023 |
Igor Ostapenko <pm@igoro.pro> |
vfs: fix description comment of vfs_lookup() Signed-off-by: Igor Ostapenko <pm@igoro.pro> Reviewed by: imp, mhorne Pull Request: https://github.com/freebsd/freebsd-src/pull/737
|
#
cea7c564 |
|
13-Jun-2023 |
Dmitry Chagin <dchagin@FreeBSD.org> |
namei: Reset the lookup to start from the real root for abs symlink target Since fd745e1d, the Linux ABI specifies an alternative root directory to reroot lookups. First, an attempt is made to look up the file in /ABI/original-path. If that fails, the lookup is done in /original-path. When looking up a symbolic link whose target has a leading /, namei() fails because the reroot reloads the original file name. To avoid this, handle the restart in a special manner, without reloading the original path name. Reported by: Goran Mekić, Vincent Milum Jr Tested by: Goran Mekić Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40479
|
#
861abdad |
|
13-Jun-2023 |
Dmitry Chagin <dchagin@FreeBSD.org> |
namei: Add a comment explaining ISRESTARTED flag Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40494
|
#
07c0b6e5 |
|
29-May-2023 |
Dmitry Chagin <dchagin@FreeBSD.org> |
vfs: Retire kern_alternate_path() as no longer used From now on, a non-native ABI should use pwd_altroot() to tell namei() its root directory so that lookups are dynamically rerooted. Differential Revision: https://reviews.freebsd.org/D40093 MFC after: 2 months
|
#
3d2fec7d |
|
29-May-2023 |
Dmitry Chagin <dchagin@FreeBSD.org> |
namei: Add the ability for the ABI to specify an alternate root path For now a non-native ABI (i.e., Linux) uses the kern_alternate_path() facility to dynamically reroot lookups. First, an attempt is made to look up the file in /compat/linux/original-path. If that fails, the lookup is done in /original-path. That requires a bit of code in every ABI syscall implementation where path name translation is needed. Also, our kern_alternate_path() does not properly look up absolute symlinks in the second attempt, i.e., it does not append the /compat/linux part to the resolved link. The change is intended to avoid this by specifying the ABI root directory for namei(), using one call to pwd_altroot() during exec-time into the ABI. In that case namei() will dynamically reroot lookups as mentioned above. PR: 72920 Reviewed by: kib Differential revision: https://reviews.freebsd.org/D38933 MFC after: 2 months
|
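The two-step lookup order described in 3d2fec7d (alternate root first, then the real root) can be sketched in userland terms. This is an illustration under stated assumptions, not how namei() or pwd_altroot() is implemented: `toy_files`, `toy_exists()` and `lookup_with_altroot()` are hypothetical, and a fixed list of strings stands in for the filesystem.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the filesystem: the only paths that "exist". */
static const char *toy_files[] = {
	"/compat/linux/bin/sh",
	"/etc/hosts",
};

static bool
toy_exists(const char *path)
{
	for (size_t i = 0; i < sizeof(toy_files) / sizeof(toy_files[0]); i++)
		if (strcmp(toy_files[i], path) == 0)
			return (true);
	return (false);
}

/*
 * Try <altroot><path> first, then <path> itself.  On success the path that
 * matched is left in `out`.  Truncated results are treated as failures.
 */
static bool
lookup_with_altroot(const char *altroot, const char *path, char *out,
    size_t outlen)
{
	int n;

	n = snprintf(out, outlen, "%s%s", altroot, path);
	if (n > 0 && (size_t)n < outlen && toy_exists(out))
		return (true);
	n = snprintf(out, outlen, "%s", path);
	return (n > 0 && (size_t)n < outlen && toy_exists(out));
}
```

In this toy, looking up "/bin/sh" with the alternate root "/compat/linux" resolves inside the alternate root, while "/etc/hosts" falls back to the real root.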
#
cf0fc64b |
|
03-May-2023 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: reduce audit branching in namei_setup
|
#
a718431c |
|
23-Apr-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
lookup(): ensure that openat("/", "..", O_RESOLVE_BENEATH) fails PR: 269780 Reported by: Dan Gohman <dev@sunfishcode.online> Reviewed by: emaste, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39773
|
#
0c01203e |
|
28-Mar-2023 |
Jason A. Harmening <jah@FreeBSD.org> |
vfs_lookup(): re-check v_mountedhere on lock upgrade The VV_CROSSLOCK handling logic may need to upgrade the covered vnode lock depending upon the requirements of the filesystem into which vfs_lookup() is walking. This may involve transiently dropping the lock, which can allow the target mount to be unmounted. Tested by: pho Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D39272
|
#
7b6fe242 |
|
08-Apr-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
DEBUG_VFS_LOCKS: use witness if available The assert_vop_locked messages are ignored, and file/line information is not too useful. Fixing this without changing both witness and VFS asserts KPIs is not possible. Reviewed by: markj (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39464
|
#
829f0bcb |
|
19-Dec-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add the concept of vnode state transitions To quote from a comment above vput_final: <quote> * XXX Some filesystems pass in an exclusively locked vnode and strongly depend * on the lock being held all the way until VOP_INACTIVE. This in particular * happens with UFS which adds half-constructed vnodes to the hash, where they * can be found by other code. </quote> As is there is no mechanism which allows filesystems to denote that a vnode is fully initialized, consequently problems like the above are only found the hard way(tm). Add rudimentary support for state transitions, which in particular allow to assert the vnode is not legally unlocked until its fate is decided (either construction finishes or vgone is called to abort it). The new field lands in a 1-byte hole, thus it does not grow the struct. Bump __FreeBSD_version to 1400077 Reviewed by: kib (previous version) Tested by: pho Differential Revision: https://reviews.freebsd.org/D37759
|
#
8f7859e8 |
|
14-Dec-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: retire the now unused SAVESTART flag Bump __FreeBSD_version to 1400075 Tested by: pho
|
#
8f874e92 |
|
09-Nov-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: make relookup take an additional argument instead of looking at SAVESTART This is a step towards removing the flag. Reviewed by: mckusick Tested by: pho Differential Revision: https://reviews.freebsd.org/D34468
|
#
269c564b |
|
17-Nov-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: retire NDFREE There are no consumers anymore. Interested parties can NDFREE_PNBUF and vput or vrele relevant vnodes. Tested by: pho
|
#
42442d7a |
|
26-Oct-2022 |
Jason A. Harmening <jah@FreeBSD.org> |
Generalize the VV_CROSSLOCK logic in vfs_lookup() When VV_CROSSLOCK is present, the lock for the vnode at the current stage of lookup must be held across the VFS_ROOT() call for the filesystem mounted at the vnode. Since VV_CROSSLOCK implies that the root vnode reuses the already-held lock, the possibility for recursion should be made clear in the flags passed to VFS_ROOT(). For cases in which the lock is held exclusive, this means passing LK_CANRECURSE. For cases in which the lock is held shared, it means clearing LK_NODDLKTREAT to allow VFS_ROOT() to potentially recurse on the shared lock even in the presence of an exclusive waiter. That the existing code works for unionfs is due to a coincidence of the current unionfs implementation. Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D37458
|
#
f7833196 |
|
19-Oct-2022 |
Jason A. Harmening <jah@FreeBSD.org> |
vfs_lookup(): Minor performance optimizations Refactor the symlink and mountpoint traversal logic to avoid repeatedly checking the vnode type; a symlink cannot be a mountpoint and vice versa. Avoid repeatedly checking cn_flags for NOCROSSMOUNT and simplify the check which determines whether the vnode is a mountpoint. Suggested by: mjg Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D35054
|
#
706f15c5 |
|
04-Aug-2022 |
Jason A. Harmening <jah@FreeBSD.org> |
Remove witness directives from crossmp locking VOPs These are of limited use since the crossmp vnode locking ops have not actually used a lock since commit a2d35545429117e68fbcbc68e14ad55e84265d69. We in fact require that these operations are always issued with LK_SHARED. Additionally, these directives can produce a false positive in certain VV_CROSSLOCK cases which require upgrading of the covered vnode lock from shared to exclusive. While here, replace the runtime check of LK_SHARED with a KASSERT and expand the check to include LK_NOWAIT, which all callers pass. Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D35054
|
#
080ef8a4 |
|
04-Aug-2022 |
Jason A. Harmening <jah@FreeBSD.org> |
Add VV_CROSSLOCK vnode flag to avoid cross-mount lookup LOR When a lookup operation crosses into a new mountpoint, the mountpoint must first be busied before the root vnode can be locked. When a filesystem is unmounted, the vnode covered by the mountpoint must first be locked, and then the busy count for the mountpoint drained. Ordinarily, these two operations work fine if executed concurrently, but with a stacked filesystem the root vnode may in fact use the same lock as the covered vnode. By design, this will always be the case for unionfs (with either the upper or lower root vnode depending on mount options), and can also be the case for nullfs if the target and mount point are the same (which admittedly is very unlikely in practice). In this case, we have LOR. The lookup path holds the mountpoint busy while waiting on what is effectively the covered vnode lock, while a concurrent unmount holds the covered vnode lock and waits for the mountpoint's busy count to drain. Attempt to resolve this LOR by allowing the stacked filesystem to specify a new flag, VV_CROSSLOCK, on a covered vnode as necessary. Upon observing this flag, the vfs_lookup() will leave the covered vnode lock held while crossing into the mountpoint. Employ this flag for unionfs with the caveat that it can't be used for '-o below' mounts until other unionfs locking issues are resolved. Reported by: pho Tested by: pho Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D35054
|
#
b77bdfdb |
|
17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix non-INVARIANTS build after 5b5b7e2ca2fa9a2418dd51749f4ef6f881ae7179 Reported by: gj
|
#
5b5b7e2c |
|
17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: always retain path buffer after lookup This removes some of the complexity needed to maintain HASBUF and allows for removing injecting SAVENAME by filesystems. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D36542
|
#
3df3d88c |
|
16-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: move cn_nameptr assignment out of namei_getpath
|
#
f7dc4a71 |
|
13-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: plug spurious error checks in namei error is guaranteed 0 at that point
|
#
b4137c9e |
|
12-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: make NDVALIDATE private to vfs_lookup.c it is not used elsewhere.
|
#
14312394 |
|
27-Apr-2022 |
John Baldwin <jhb@FreeBSD.org> |
Add a __witness_used for variables only used under #ifdef WITNESS. __diagused is now solely used for variables only used under INVARIANTS. Reviewed by: mjg Differential Revision: https://reviews.freebsd.org/D35085
|
#
c9b04ee4 |
|
02-Apr-2022 |
Gordon Bergling <gbe@FreeBSD.org> |
kern: Fix two typos in source code comments - s/accomodate/accommodate/ MFC after: 3 days
|
#
0c805718 |
|
24-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix memory leak on lookup with fds with ioctl caps Reviewed by: markj PR: 262515 Noted by: firk@cantconnect.ru Differential Revision: https://reviews.freebsd.org/D34667
|
#
a4032e2a |
|
26-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: assorted tidy ups to lookup No functional changes.
|
#
0f600883 |
|
25-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: set cn_namelen when handling degenerate lookups Turns out execve looks at it to store binary name, but in order to trigger the problem one has to be trying to exec '/'. As is the value would be left uninitialized (or rather set to -1 on debug kernels). Fixes: 56244d35741a62e7 ("vfs: hoist degenerate path lookups out of the loop")
|
#
4ef6e56a |
|
24-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: hoist trailing slash handling out of the loop
|
#
3b6792d2 |
|
23-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: factor symlink traversal out of namei The intent down the road is to eliminate the loop to begin with, pushing traversal down to vfs_lookup, all while not allocating the extra buffer.
|
#
d9ea7e2b |
|
13-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: factor FAILIFEXISTS handling out of vfs_lookup
|
#
56244d35 |
|
11-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: hoist degenerate path lookups out of the loop
|
#
bb92cd7b |
|
24-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)
|
#
93a0ba8f |
|
17-Sep-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: retire the no longer used MNTK_LOOKUP_EXCL_DOTDOT flag Reviewed by: markj Tested by: pho (previous version) Differential Revision: https://reviews.freebsd.org/D34466
|
#
0134bbe5 |
|
13-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: prefix lookup and relookup with vfs_ Reviewed by: imp, mckusick Differential Revision: https://reviews.freebsd.org/D34530
|
#
2cee5861 |
|
28-Dec-2021 |
John Baldwin <jhb@FreeBSD.org> |
sys/kern: Use C99 fixed-width integer types. No functional change. Reviewed by: imp, kib Differential Revision: https://reviews.freebsd.org/D33630
|
#
054f5815 |
|
25-Nov-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: plug a set-but-not-used var in kern_alternate_path Sponsored by: Rubicon Communications, LLC ("Netgate")
|
#
7e1d3eef |
|
25-Nov-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove the unused thread argument from NDINIT* See b4a58fbf640409a1 ("vfs: remove cn_thread") Bump __FreeBSD_version to 1400043.
|
#
c40fee6f |
|
25-Nov-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the always curthread argument from kern_alternate_path
|
#
7dd419ca |
|
26-Sep-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
cache: add empty path support This avoids spurious drop offs as EMPTY is passed regardless of the actual path name. Pushing the work inside the lookup instead of just ignoring the flag avoids checking for an empty pathname in all other lookups.
|
#
46dd801a |
|
16-Oct-2021 |
Colin Percival <cperciva@FreeBSD.org> |
Add userland boot profiling to TSLOG On kernels compiled with 'options TSLOG', record for each process ID: * The timestamp of the fork() which creates it and the parent process ID, * The first path passed to execve(), if any, * The first path resolved by namei, if any, and * The timestamp of the exit() which terminates the process. Expose this information via a new sysctl, debug.tslog_user. On kernels lacking 'options TSLOG' (the default), no information is recorded and the sysctl does not exist. Note that recording namei is needed in order to obtain the names of rc.d scripts being launched, as the rc system sources them in a subshell rather than execing the scripts. With this commit it is now possible to generate flamecharts of the entire boot process from the start of the loader to the end of /etc/rc. The code needed to perform this processing is currently found in github: https://github.com/cperciva/freebsd-boot-profiling Reviewed by: mhorne Sponsored by: https://www.patreon.com/cperciva Differential Revision: https://reviews.freebsd.org/D32493
|
#
b4a58fbf |
|
01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove cn_thread It is always curthread. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D32453
|
#
c9536389 |
|
01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: hoist cn_thread assert in namei Making it conditional on whether ktrace happens to be enabled makes no sense.
|
#
7fd856ba |
|
22-Aug-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: s/__unused/__diagused in crossmp_* Sponsored by: Rubicon Communications, LLC ("Netgate")
|
#
5d75ffdd |
|
20-Aug-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove an unused variable from nameicap_tracker_add Reported by cc --analyze Sponsored by: Rubicon Communications, LLC ("Netgate")
|
#
9446d9e8 |
|
13-Aug-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
fstatat(2): handle non-vnode file descriptors for AT_EMPTY_PATH Set NIRES_EMPTYPATH earlier, so that use of EMPTYPATH is recorded even if we are going to return an error. When namei_setup() refuses to accept a dirfd that is not of the vnode type, as indicated by an ENOTDIR error return, fall back to kern_fstat(dirfd). Reported by: dchagin Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31530
|
#
d81aefa8 |
|
25-May-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: use the sentinel trick in locked lookup path parsing
|
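Commit d81aefa8 does not spell out the trick. As a generic illustration only (an assumption about the general technique, not a copy of the kernel's parsing code), a sentinel scan temporarily overwrites the terminating NUL with the delimiter so that the inner loop tests for a single character. `first_component_len()` is a hypothetical helper.

```c
#include <stddef.h>

/*
 * Return the length of the first path component in buf.
 * Preconditions: buf is writable and buf[len] is the terminating NUL.
 */
static size_t
first_component_len(char *buf, size_t len)
{
	char *cp;

	buf[len] = '/';		/* sentinel: the scan only looks for '/' */
	for (cp = buf; *cp != '/'; cp++)
		continue;
	buf[len] = '\0';	/* restore the terminator */
	return ((size_t)(cp - buf));
}
```

Without the sentinel, the loop would have to test for both '/' and NUL on every character. The buffer must be writable and its terminator must be restored before anyone else reads it.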
#
cef8a95a |
|
13-May-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix vnode use count leak in O_EMPTY_PATH support The vnode returned by namei_setup is already referenced. Reported by: pho
|
#
a5970a52 |
|
03-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Make files opened with O_PATH to not block non-forced unmount by only keeping hold count on the vnode, instead of the use count. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
8d9ed174 |
|
17-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
open(2): Implement O_PATH Reviewed by: markj Tested by: pho Discussed with: walker.aj325_gmail.com, wulf Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
509124b6 |
|
07-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add AT_EMPTY_PATH for several *at(2) syscalls It is currently allowed for fchownat(2), fchmodat(2), fchflagsat(2), utimensat(2), fstatat(2), and linkat(2). For linkat(2), PRIV_VFS_FHOPEN privilege is required to exercise the flag. It allows linking any open file. Requested by: trasz Tested by: pho, trasz Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29111
|
#
71c160a8 |
|
27-Mar-2021 |
Mark Johnston <markj@FreeBSD.org> |
vfs: Add an assertion around name length limits Some filesystems assume that they can copy a name component, with length bounded by NAME_MAX, into a dirent buffer of size MAXNAMLEN. These constants have the same value; add a compile-time assertion to that effect. Reported by: Alexey Kulaev <alex.qart@gmail.com> Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29431
|
#
28cd3a67 |
|
28-Feb-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
O_RESOLVE_BENEATH: return ENOTCAPABLE instead of EINVAL for abs path Requested and reviewed by: markj Tested by: arichardson, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28907
|
#
49c98a4b |
|
27-Feb-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
nameicap_check_dotdot: trim tracker on check Tracker should contain exactly the path from the starting directory to the current lookup point. Otherwise we might not detect some cases of dotdot escape. Consequently, if we are walking up the tree by dotdot lookup, we must remove any entries below the walked directory. Reviewed by: markj Tested by: arichardson, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D28907
|
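The idea of tracking the path from the starting directory, and trimming it on "..", can be approximated with a depth counter. This is a deliberately simplified sketch: the kernel tracker described in 49c98a4b works on directories visited during a real lookup, and this toy ignores symlinks and mount points entirely. `stays_beneath()` is a hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if a purely lexical walk of `path` never leaves the starting
 * directory.  Absolute paths are rejected outright.
 */
static bool
stays_beneath(const char *path)
{
	int depth = 0;	/* components currently below the start directory */

	if (path[0] == '/')
		return (false);
	while (*path != '\0') {
		size_t len = strcspn(path, "/");

		if (len == 2 && path[0] == '.' && path[1] == '.') {
			if (depth == 0)
				return (false);	/* ".." would escape */
			depth--;		/* trim the deepest entry */
		} else if (len == 1 && path[0] == '.') {
			/* "." does not move */
		} else if (len > 0) {
			depth++;		/* descended one level */
		}
		path += len;
		while (*path == '/')
			path++;
	}
	return (true);
}
```

Here "a/../b" is accepted because the walk returns to the start directory before descending again, while "a/../../b" is rejected at the second "..".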
#
e8a2862a |
|
27-Feb-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add nameicap_cleanup_from(), to clean tracker list starting from some element Reviewed by: markj Tested by: arichardson, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D28907
|
#
2388ad7c |
|
27-Feb-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
nameicap_tracker_add: avoid duplicates in the tracker list Reviewed by: markj Tested by: arichardson, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D28907
|
#
59e74942 |
|
27-Feb-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not call nameicap_tracker_add() for dotdot case. Reviewed by: markj Tested by: arichardson, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D28907
|
#
20e91ca3 |
|
15-Feb-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
open(2): Remove O_BENEATH and AT_BENEATH with the reasoning that the flags did not work properly, and were not shipped in a release. O_RESOLVE_BENEATH is kept as useful. Reviewed by: markj Tested by: arichardson, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D28907
|
#
739ecbcf |
|
23-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
cache: add symlink support to lockless lookup Reviewed by: kib (previous version) Tested by: pho (previous version) Differential Revision: https://reviews.freebsd.org/D27488
|
#
70ba7770 |
|
12-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: extend vfs:namei:lookup:return probe with nameidata
|
#
cdb62ab7 |
|
11-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add NDFREE_NOTHING and convert several NDFREE_PNBUF callers Check the comment above the routine for reasoning.
|
#
002e18eb |
|
27-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add FAILIFEXISTS flag Both FreeBSD and Linux mkdir -p walk the tree up ignoring any EEXIST on the way and both are used a lot when building respective kernels. This poses a problem as spurious locking avoidably interferes with concurrent operations like getdirentries on affected directories. Work around the problem by adding FAILIFEXISTS flag. In case of lockless lookup this manages to avoid any work to begin with, there is no speed up for the locked case but perhaps this can be augmented later on. For simplicity the only supported semantics are as used by mkdir. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D27789
|
#
8fcfd0e2 |
|
06-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add cleanup on error missed in r368375 Noted by: jrtc27
|
#
60e2a0d9 |
|
05-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: factor buffer allocation/copyin out of namei
|
#
9c8c797c |
|
22-Nov-2020 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Remove the 'wantparent' variable, unused since r145004. Reviewed by: kib MFC after: 2 weeks Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D27193
|
#
2fbb45c6 |
|
04-Nov-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: change nt_zone into a malloc type Elements are small in size and allocated for short periods.
|
#
62568e88 |
|
29-Oct-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add NAMEI_DBG_HADSTARTDIR handling lost in rewrite Noted by: rpokala
|
#
eebc2e45 |
|
28-Oct-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add NDREINIT to facilitate repeated namei calls struct nameidata mixes caller arguments, internal state and output, which can be quite error prone. Recent addition of validating ni_resflags uncovered a caller which could repeatedly call namei, effectively operating on partially populated state. Add bare minimum validation that this does not happen. The real fix would decouple the aforementioned state. Reported by: pho Tested by: pho (different variant)
|
#
d681c51d |
|
26-Oct-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
cache: add missing NIRES_ABS handling
|
#
4ea49660 |
|
08-Oct-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not allow O_BENEATH to be used as an oracle. Specifically, if lookup() returned any error and the topping directory was not latched, which means that the (non-existent) path did not return to the topping location, give ENOTCAPABLE priority over the lookup() error. PR: 249960 Reviewed by: emaste, ngie Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D26695
|
#
1317da43 |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Add O_RESOLVE_BENEATH and AT_RESOLVE_BENEATH to mimic Linux' RESOLVE_BENEATH. It is like O_BENEATH, but disallows walking out of the subtree rooted in the starting directory. O_BENEATH does not care if the path walks out, as long as it returns. Requested by: Dan Gohman <dev@sunfishcode.online> PR: 248335 Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
6a9c72d9 |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Change O_BENEATH to handle relative paths same as absolute. Do not care if path walks out of the topping directory if it returns back. Requested and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
07e7ad2b |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Only clear latch for BENEATH when we walk out of the startdir, not unconditionally on any dotdot component. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
c7de3d6f |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Add NIRES_STRICTREL. Stop abusing internal namei flag NI_LCF_STRICTRELATIVE as indicator of cap-restricted lookup. Add designated returned flag NIRES_STRICTREL to inform kern_openat() that lookup was restricted. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
f9e46c9b |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
lookup: Track last lookup component if it is directory. This makes open("/a/../a", O_BENEATH) with cwd == "/a" work. Reviewed by: markj Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
44619a5e |
|
22-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Improve comment above nameicap_check_dotdot(). Explain why tracker is needed at all. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25886
|
#
66ac5b2c |
|
27-Aug-2020 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add a comment to clarify when and why cached names are deleted during pathname lookup. Reviewed by: kib MFC after: 3 days Sponsored by: Netflix
|
#
f0d9c77e |
|
23-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: validate ndp state after the lookup The intent is to remove known-to-be-nops NDFREE calls after many lookups.
|
#
4b500119 |
|
23-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: convert nameiop into an enum While here change the field size from long to int and move it into the gap next to cn_flags. Shrinks struct componentname from 64 to 56 bytes on amd64.
|
#
de0fcd3a |
|
22-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: assert that HASBUF is only set with SAVENAME or SAVESTART as requested by the caller. The intent is to eradicate the mostly spurious NDFREE_PNBUF calls.
|
#
760a430b |
|
22-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add a work around for vp_crossmp bug to realpath The actual bug is not yet addressed as it will get much easier after other problems are addressed (most notably rename contract). The only affected in-tree consumer is realpath. Everyone else happens to be performing lookups within a mount point, having a side effect of ni_dvp being set to mount point's root vnode in the worst case. Reported by: pho
|
#
494c0f2a |
|
16-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: mark HASBUF as an internal flag There is no setter for cn_pnbuf.
|
#
b38ad268 |
|
13-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add missing pwd_drop on error in namei_setup Reported by: pho
|
#
2d0631dd |
|
10-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: stricter validation for flags passed to namei in cn_flags namei de facto expects that the nameidata object is properly initialized, but at the same time it mixes consumer-passable and internal flags, while tolerating this part by explicitly clearing some of them. Tighten the interface instead. While here renumber the flags and denote the gap between the 2 variants. Try to piggyback the renumber on the just bumped __FreeBSD_version.
|
#
25e42ee2 |
|
10-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the hello world stat probes from the vfs provider Interested parties can get the same information by hooking on vop_stat.
|
#
7f700801 |
|
10-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: disallow NOCACHE with LOOKUP This means there is no expectation lookup will purge the terminal entry, which simplifies lockless lookup. Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
158ab70c |
|
05-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: tidy up namei entry point - predict for string copy errors - reshuffle initialisation of vars which are not needed
|
#
85cf3161 |
|
01-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: inline NDINIT_ALL The routine takes more than 6 arguments, which on amd64 means some of them have to be passed through the stack.
|
#
14576629 |
|
01-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: convert ni_rightsneeded to a pointer Shaves 8 bytes of struct nameidata on 64-bit platforms.
|
#
21c16260 |
|
01-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: make rights mandatory for NDINIT_ALL
|
#
b1f910e0 |
|
30-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: short-circuit the common case NDFREE calls Almost all consumers use the NDF_ONLY_PNBUF macro, making them avoidably branch a lot in the NDFREE routine. Also note most of them should not need to call any cleanup anyway as they don't request HASBUF.
|
#
d3e63e8e |
|
30-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: make sure startdir_used is always assigned to before use CID: 1431070
|
#
c42b77e6 |
|
25-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: lockless lookup Provides full scalability as long as all visited filesystems support the lookup and terminal vnodes are different. Inner workings are explained in the comment above cache_fplookup. Capabilities and fd-relative lookups are not supported and will result in immediate fallback to regular code. Symlinks, ".." in the path, mount points without support for lockless lookup and mismatched counters will result in an attempt to get a reference to the directory vnode and continue in regular lookup. If this fails, the entire operation is aborted and regular lookup starts from scratch. However, care is taken that data is not copied again from userspace. Sample benchmark: incremental -j 104 bzImage on tmpfs: before: 142.96s user 1025.63s system 4924% cpu 23.731 total after: 147.36s user 313.40s system 3216% cpu 14.326 total Sample microbenchmark: access calls to separate files in /tmpfs, 104 workers, ops/s: before: 2165816 after: 151216530 Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25578
|
#
422f38d8 |
|
10-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix trivial whitespace issues which don't interfere with blame .. even without the -w switch
|
#
2f423bce |
|
01-Mar-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: stop taking additional refs on root vnode during lookup They are spurious since introduction of struct pwd, which provides them implicitly. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23885
|
#
8d03b99b |
|
01-Mar-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: move vnodes out of filedesc into a dedicated structure The new structure is copy-on-write. With the assumption that path lookups are significantly more frequent than chdirs and chrooting this is a win. This provides stable root and jail root vnodes without the need to reference them on lookup, which in turn means less work on globally shared structures. Note this also happens to fix a bug where jail vnode was never referenced, meaning subsequent access on lookup could run into use-after-free. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23884
|
#
721a81c3 |
|
20-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: stop duplicating vnode work in audit during path lookup Duplicating the work was putting an avoidable requirement that the filedesc lock is held across the entire operation (otherwise by the time audit reads vnode pointers another thread in the same process can chdir somewhere else, making audit log things using different vnode than the one which will be used for actual lookup). Do the obvious thing and pass down vnodes which will be used.
|
#
e126c5a3 |
|
14-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: use new capsicum helpers
|
#
6ebab6ba |
|
13-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: use mac fastpath for lookup, open, read, write, mmap
|
#
3d62f685 |
|
03-Feb-2020 |
Kyle Evans <kevans@FreeBSD.org> |
namei: preserve errors from fget_cap_locked Most notably, we want to make sure we don't clobber any capabilities-related errors. This is a regression from r357412 (O_SEARCH) that was picked up by the capsicum tests. PR: 243839 Reviewed by: kib (committed form recommended by) Tested by: lwhsu Differential Revision: https://reviews.freebsd.org/D23479
|
#
6a5abb1e |
|
02-Feb-2020 |
Kyle Evans <kevans@FreeBSD.org> |
Provide O_SEARCH O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping permissions checks on the directory itself after the initial open(). This is close to the semantics we've historically applied for O_EXEC on a directory, which is UB according to POSIX. Conveniently, O_SEARCH on a file is also explicitly undefined behavior according to POSIX, so O_EXEC would be a fine choice. The spec goes on to state that O_SEARCH and O_EXEC need not be distinct values, but they're not defined to be the same value. This was pointed out as an incompatibility with other systems that had made its way into libarchive, which had assumed that O_EXEC was an alias for O_SEARCH. This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a directory is checked in vn_open_vnode already, so for completeness we add a NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not re-check that when descending in namei. [0] https://pubs.opengroup.org/onlinepubs/9699919799/ Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23247
|
#
90f4ec33 |
|
31-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: save on atomics on the root vnode for absolute lookups There are 2 back-to-back atomics on the vnode, but we can check upfront if one is sufficient. Similarly we can handle relative lookups where current working directory == root directory. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23427
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
6fa079fc |
|
15-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: flatten vop vectors This eliminates the following loop from all VOP calls: while(vop != NULL && \ vop->vop_spare2 == NULL && vop->vop_bypass == NULL) vop = vop->vop_default; Reviewed by: jeff Tested by: pho Differential Revision: https://reviews.freebsd.org/D22738
|
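The idea behind flattening can be sketched in userland C. This is an illustration under assumptions, not the kernel implementation: `struct opvec`, its two slots, and the function names are hypothetical, and the real vop vectors carry many more entries.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*op_fn)(int);

/* Hypothetical two-slot operation vector with a fallback pointer. */
struct opvec {
	struct opvec	*def;	/* fallback vector, or NULL */
	op_fn		 open;
	op_fn		 close;
};

/* Slow path: walk the default chain on every call to find a handler. */
op_fn
opvec_resolve_open(const struct opvec *v)
{
	while (v != NULL && v->open == NULL)
		v = v->def;
	return (v != NULL ? v->open : NULL);
}

/* One-time flattening: copy inherited handlers into the vector itself. */
void
opvec_flatten(struct opvec *v)
{
	const struct opvec *d;

	assert(v != NULL);
	for (d = v->def; d != NULL; d = d->def) {
		if (v->open == NULL)
			v->open = d->open;
		if (v->close == NULL)
			v->close = d->close;
	}
	v->def = NULL;	/* nothing left to inherit */
}

/* Sample handlers used to exercise the sketch. */
int sample_open(int x) { return (x + 1); }
int sample_close(int x) { return (x - 1); }
int custom_close(int x) { return (x * 2); }
```

After `opvec_flatten()` every slot can be called directly, so the per-call chain walk is no longer needed. The cost moves to a single pass at setup time.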
#
abd80ddb |
|
08-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715
|
#
e28fa55a |
|
21-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
NDFREE(): Fix unlocking for LOCKPARENT|LOCKLEAF and ndp->ni_dvp == ndp->ni_vp. NDFREE() calculates unlock_dvp after ndp->ni_vp is unlocked and zeroed out. This makes the comparison of ni_dvp with ni_vp always fail. Move the calculation of unlock_dvp right after unlock_vp, so that the code sees correct ni_vp value. Reproduced by chdir("/usr"); open("/..", O_BENEATH | O_RDONLY); Reported by: syzkaller Reviewed by: markj, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20304
|
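The ordering problem described in this entry can be reduced to a few lines. The sketch below is an assumption-laden miniature: `struct nd_sketch` and both functions are hypothetical, and the real NDFREE() also consults lookup flags that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the two vnode pointers kept in nameidata. */
struct nd_sketch {
	const void *dvp;	/* parent directory */
	const void *vp;		/* lookup result */
};

/* Buggy order: vp is cleared first, so dvp == vp can never be observed. */
bool
dvp_needs_unlock_buggy(struct nd_sketch *nd)
{
	assert(nd != NULL);
	nd->vp = NULL;		/* stands in for unlocking and zeroing vp */
	return (nd->dvp != nd->vp);
}

/* Fixed order: decide while vp still holds its original value. */
bool
dvp_needs_unlock_fixed(struct nd_sketch *nd)
{
	bool unlock_dvp;

	assert(nd != NULL);
	unlock_dvp = (nd->dvp != nd->vp);
	nd->vp = NULL;
	return (unlock_dvp);
}
```

When both pointers refer to the same object, the buggy variant still asks for a second unlock of the parent, while the fixed variant does not.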
#
7cdb0b9d |
|
07-Feb-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix renameat(2) for CAPABILITIES kernels. When renameat(2) is used with: - absolute path for to; - tofd not set to AT_FDCWD; - the target exists kern_renameat() requires CAP_UNLINK capability on tofd, but corresponding namei ni_filecap is not initialized at all because the lookup is absolute. As a result, the check was done against empty filecap and syscall fails erroneously. Fix it by creating a return flags namei member and reporting if the lookup was absolute, then do not touch to.ni_filecaps at all. PR: 222258 Reviewed by: jilles, ngie Sponsored by: The FreeBSD Foundation MFC after: 1 week X-MFC-note: KBI breakage Differential revision: https://reviews.freebsd.org/D19096
|
#
24d64be4 |
|
13-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: mostly depessimize NDINIT_ALL 1) filecaps_init was unnecessarily a function call 2) an assignment at the end was preventing tail calling of cap_rights_init Sponsored by: The FreeBSD Foundation
|
#
7d2b0bd7 |
|
29-Nov-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
If BENEATH is specified, always latch the topping directory vnode. It is possible that we started with a relative path but during the lookup, found an absolute symlink. In this case, BENEATH handling code needs the latch, but it is too late to calculate it. While there, somewhat improve the assertions. Clear the NI_LCF_LATCH flag when the latch vnode is released, so that asserts know the state. Assert that there is a latch if we entered beneath+abs path mode, after the starting point is processed. Reported by: wulf With more input from: pho Sponsored by: The FreeBSD Foundation
|
#
ade85c5e |
|
10-Nov-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Allow absolute paths for O_BENEATH. The path must have a tail which does not escape starting/topping directory. The documentation will come shortly, see the man pages commit message for the reason of separate commit. Reviewed by: jilles (previous version) Discussed with: emaste Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D17714
|
#
4f77f488 |
|
25-Oct-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement O_BENEATH and AT_BENEATH. Flags prevent open(2) and *at(2) vfs syscalls name lookup from escaping the starting directory. Supposedly the interface is similar to the same proposed Linux flags. Reviewed by: jilles (code, previous version of manpages), 0mp (manpages) Discussed with: allanjude, emaste, jonathan Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D17547
|
#
c396945b |
|
20-Sep-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove lookup_shared tunable Reviewed by: kib, jhb Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D17253
|
#
84482abd |
|
18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
vfs: annotate variables only used by debug builds as __unused
|
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
38d84d68 |
|
17-Nov-2017 |
Conrad Meyer <cem@FreeBSD.org> |
vfs_lookup: Allow PATH_MAX-1 symlinks Previously, symlinks in FreeBSD were artificially limited to PATH_MAX-2. Add a short test case to verify the change. Submitted by: Gaurav Gangalwar <ggangalwar AT isilon.com> Reviewed by: kib Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12589
|
#
11ce4d9f |
|
15-Oct-2017 |
Tijl Coosemans <tijl@FreeBSD.org> |
When a Linux program tries to access a /path the kernel tries /compat/linux/path before /path. Stop following symbolic links when looking up /compat/linux/path so dead symbolic links aren't ignored. This allows syscalls like readlink(2) and lstat(2) to work on such links. And open(2) will return an error now instead of trying /path.
|
#
03f7f178 |
|
15-Mar-2017 |
John Baldwin <jhb@FreeBSD.org> |
Use UMA_ALIGN_PTR instead of sizeof(void *) for zone alignment. uma_zcreate()'s alignment argument is supposed to be sizeof(foo) - 1, and uma.h provides a set of helper macros for common types. Passing sizeof(void *) results in all of the members being misaligned triggering unaligned access faults on certain architectures (notably MIPS). Reported by: brooks Obtained from: CheriBSD MFC after: 3 days Sponsored by: DARPA / AFRL
|
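The difference between an alignment and an alignment mask is easy to demonstrate outside the kernel. The helper below is a hypothetical sketch of mask-based rounding; how UMA applies the value internally is not shown in this log and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Round addr up using a mask of the form (alignment - 1).
 * The result is only meaningful when mask + 1 is a power of two.
 */
uintptr_t
align_up_mask(uintptr_t addr, uintptr_t mask)
{
	return ((addr + mask) & ~mask);
}

/* Report whether addr is suitably aligned for a pointer. */
int
is_ptr_aligned(uintptr_t addr)
{
	return (addr % sizeof(void *) == 0);
}
```

With `sizeof(void *) - 1` as the mask, an address of 1 rounds up to `sizeof(void *)`. Passing `sizeof(void *)` itself clears only a single bit and leaves the address at 1, which is the kind of misalignment the message describes.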
#
aec8391d |
|
22-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Provide fallback VOP methods for crossmp vnode. In particular, crossmp vnode might leak into rename code. PR: 216380 Reported by: fnacl@protonmail.com Sponsored by: The FreeBSD Foundation X-MFC with: r309425
|
#
5ec7cde4 |
|
04-Jan-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Fix bug that would result in a kernel crash in some cases involving a symlink and an autofs mount request. The crash was caused by namei() calling bcopy() with a negative length, caused by numeric underflow: in lookup(), in the relookup path, the ni_pathlen was decremented too many times. The bug was introduced in r296715. Big thanks to Alex Deiter for his help with debugging this. Reviewed by: kib@ Tested by: Alex Deiter <alex.deiter at gmail.com> MFC after: 1 month
|
#
5afb134c |
|
12-Dec-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add vrefact, to be used when the vnode has to be already active This allows blind increment of relevant counters which under contention is cheaper than inc-not-zero loops at least on amd64. Use it in some of the places which are guaranteed to see already active vnodes. Reviewed by: kib (previous version)
|
#
778aa66a |
|
12-Dec-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Enable lookup_cap_dotdot and lookup_cap_dotdot_nonlocal. Requested and reviewed by: cem Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D8746
|
#
a2d35545 |
|
02-Dec-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: provide fake locking primitives for the crossmp vnode Since the vnode is only expected to be shared locked, we can save a little overhead by only pretending we are locking in the first place. Reviewed by: kib Tested by: pho
|
#
a4ce25b5 |
|
29-Nov-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix a whitespace nit in r309307
|
#
1babea03 |
|
29-Nov-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: avoid VOP_ISLOCKED in the common case in lookup
|
#
7359fdcf |
|
01-Nov-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Allow some dotdot lookups in capability mode. If dotdot lookup does not escape from the file descriptor passed as the lookup root, we can allow the component traversal. Track the directories traversed, and check the result of dotdot lookup against the recorded list of the directory vnodes. Dotdot lookups are enabled by sysctl vfs.lookup_cap_dotdot, currently disabled by default until more verification of the approach is done. Disallow non-local filesystems for dotdot, since remote server might conspire with the local process to allow it to escape the namespace. This might be too cautious, provide the knob vfs.lookup_cap_dotdot_nonlocal to override as well. Idea by: rwatson Discussed with: emaste, jonathan, rwatson Reviewed by: mjg (previous version) Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 week Differential revision: https://reviews.freebsd.org/D8110
|
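The tracking scheme described in this entry (record each directory traversed, then accept a ".." result only if it is on the list) can be sketched as follows. All names are hypothetical, plain integers stand in for directory vnodes, and the fixed-size array is a simplification chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TRACK_MAX 32

/* Hypothetical tracker: integer ids stand in for directory vnodes. */
struct dir_tracker {
	int	dirs[TRACK_MAX];
	size_t	count;
};

/* Start tracking at the directory the lookup is rooted in. */
void
tracker_init(struct dir_tracker *t, int rootdir)
{
	assert(t != NULL);
	t->dirs[0] = rootdir;
	t->count = 1;
}

/* Record a directory entered while descending; false when the list is full. */
bool
tracker_enter(struct dir_tracker *t, int dir)
{
	if (t->count == TRACK_MAX)
		return (false);
	t->dirs[t->count++] = dir;
	return (true);
}

/* A ".." result is acceptable only if that directory was already traversed. */
bool
tracker_dotdot_ok(const struct dir_tracker *t, int parent)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->dirs[i] == parent)
			return (true);
	return (false);
}
```

A ".." that resolves to a directory above the starting point is never on the list, so it is rejected without needing to walk the filesystem hierarchy.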
#
1bf6a090 |
|
01-Nov-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove tautological casts. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
ec846935 |
|
01-Nov-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Style fixes. Discussed with: emaste Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
69a28758 |
|
15-Sep-2016 |
Ed Maste <emaste@FreeBSD.org> |
Renumber license clauses in sys/kern to avoid skipping #3
|
#
11d3ad2e |
|
27-Aug-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: provide a common exit point in namei for error cases This shortens the function, adds the SDT_PROBE use for error cases and consistently unrefs rootdir last. Reviewed by: kib MFC after: 2 weeks
|
#
411455a8 |
|
10-Aug-2016 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Replace all remaining calls to vprint(9) with vn_printf(9), and remove the old macro. MFC after: 1 month
|
#
e3043798 |
|
29-Apr-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/kern: spelling fixes in comments. No functional change.
|
#
b85f65af |
|
15-Apr-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
kern: for pointers replace 0 with NULL. These are mostly cosmetic, no functional change. Found with devel/coccinelle.
|
#
2a5a08cb |
|
12-Mar-2016 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Refactor the way we restore cn_lkflags; no functional changes. MFC after: 1 month Sponsored by: The FreeBSD Foundation
|
#
f69db551 |
|
12-Mar-2016 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Remove cn_consume from 'struct componentname'. It was never set to anything other than 0. Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5611
|
#
213ed838 |
|
12-Mar-2016 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Fix autofs triggering problem. Assume you have an NFS server, 192.168.1.1, with share "share". This commit fixes a problem where "mkdir /net/192.168.1.1/share/meh" would return spurious error instead of creating the directory if the target filesystem wasn't mounted yet; subsequent attempts would work correctly. The failure scenario is kind of complicated to explain, but it all boils down to calling VOP_MKDIR() for the target filesystem (NFS) with wrong dvp - the autofs vnode instead of the filesystem root mounted over it. Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5442
|
#
2f2f522b |
|
27-Sep-2015 |
Andriy Gapon <avg@FreeBSD.org> |
save some bytes by using more concise SDT_PROBE<n> instead of SDT_PROBE SDT_PROBE requires 5 parameters whereas SDT_PROBE<n> requires n parameters where n is typically smaller than 5. Perhaps SDT_PROBE should be made a private implementation detail. MFC after: 20 days
|
#
8c43e4cc |
|
12-Aug-2015 |
Ed Schouten <ed@FreeBSD.org> |
Properly return ENOTDIR when calling *at() on a non-vnode. We already properly return ENOTDIR when calling *at() on a non-directory vnode, but it turns out that if you call it on a socket, we see EINVAL. Patch up namei to properly translate this to ENOTDIR.
|
#
318b9463 |
|
09-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: cosmetic changes to namei and namei_handle_root - don't initialize cnp during declaration - don't test error/!error, compare to 0 instead
|
#
d177f49f |
|
09-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: simplify error handling in namei The logic is reorganised so that there is one exit point prior to the lookup loop. This is an intermediate step to making audit logging functions use found vnode instead of translating ni_dirfd on their own. ni_startdir validation is removed. The only in-tree consumer is nfs which already makes sure it is a directory. Reviewed by: kib
|
#
d19ba50e |
|
09-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: avoid spurious vref/vrele for absolute lookups namei used to vref fd_cdir, which was immediately vrele'd on entry to the loop. Check for absolute lookup and vref the right vnode the first time. Reviewed by: kib
|
#
a03f1b29 |
|
09-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: plug a use-after-free of fd_rdir in namei fd_rdir vnode was stored in ni_rootdir without refing it in any way, after which the filedesc lock was being dropped. The vnode could have been freed by mountcheckdirs or another thread doing chroot. VREF the vnode while the lock is held. Reviewed by: kib MFC after: 1 week
|
#
947401dd |
|
05-Jul-2015 |
Mark Johnston <markj@FreeBSD.org> |
Move the comment describing namei(9) back to namei()'s definition. MFC after: 3 days
|
#
72ba3c08 |
|
02-Nov-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix two issues with lockmgr(9) LK_CAN_SHARE() test, which determines whether the shared request for already shared-locked lock could be granted. Both problems result in the exclusive locker starvation. The concurrent exclusive request is indicated by either LK_EXCLUSIVE_WAITERS or LK_EXCLUSIVE_SPINNERS flags. The reverse condition, i.e. no exclusive waiters, must check that both flags are cleared. Add a flag LK_NODDLKTREAT for shared lock request to indicate that current thread guarantees that it does not own the lock in shared mode. This turns back the exclusive lock starvation avoidance code; see man page update for detailed description. Use LK_NODDLKTREAT when doing lookup(9). Reported and tested by: pho No objections from: attilio Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
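The point about checking both flags is a small piece of bit logic. The sketch below uses hypothetical flag values and function names; it does not reproduce the actual LK_CAN_SHARE() macro or claim what the earlier code looked like, only one way the test can go wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real definitions live in sys/lockmgr.h. */
#define SK_EXCLUSIVE_WAITERS	0x1u
#define SK_EXCLUSIVE_SPINNERS	0x2u

/* Too permissive: true as soon as either flag happens to be clear. */
bool
no_excl_request_loose(uint32_t x)
{
	return ((x & SK_EXCLUSIVE_WAITERS) == 0 ||
	    (x & SK_EXCLUSIVE_SPINNERS) == 0);
}

/* Strict form: true only when both flags are clear. */
bool
no_excl_request(uint32_t x)
{
	return ((x & (SK_EXCLUSIVE_WAITERS | SK_EXCLUSIVE_SPINNERS)) == 0);
}
```

With only the spinners bit set, the loose form still reports that no exclusive request is pending, so shared lockers would keep being admitted and the exclusive locker could starve.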
#
8cc11167 |
|
23-Aug-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Plug a memory leak in case of failed lookups in capability mode. Put common cnp cleanup into one function and use it for this purpose. MFC after: 1 week
|
#
af3b2549 |
|
27-Jun-2014 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
#
37a107a4 |
|
27-Jun-2014 |
Glen Barber <gjb@FreeBSD.org> |
Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory
|
#
3da1cf1e |
|
27-Jun-2014 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
#
4a144410 |
|
16-Mar-2014 |
Robert Watson <rwatson@FreeBSD.org> |
Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks
|
#
d9fae5ab |
|
26-Nov-2013 |
Andriy Gapon <avg@FreeBSD.org> |
dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE In its stead use the Solaris / illumos approach of emulating '-' (dash) in probe names with '__' (two consecutive underscores). Reviewed by: markj MFC after: 3 weeks
|
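The naming convention mentioned in this entry is a simple string mapping. As a rough userland sketch (the function name is hypothetical and this is not the kernel's SDT code), each pair of consecutive underscores in the C identifier becomes a dash in the visible probe name:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dst, turning each "__" into "-". dst must have room for
 * strlen(src) + 1 bytes. Returns the length of the converted name.
 */
size_t
sdt_display_name(char *dst, const char *src)
{
	size_t n = 0;

	assert(dst != NULL && src != NULL);
	while (*src != '\0') {
		if (src[0] == '_' && src[1] == '_') {
			dst[n++] = '-';
			src += 2;
		} else {
			dst[n++] = *src++;
		}
	}
	dst[n] = '\0';
	return (n);
}
```

For example, an identifier spelled `lookup__return` would be presented as `lookup-return`, while single underscores are left untouched.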
#
54366c0b |
|
25-Nov-2013 |
Attilio Rao <attilio@FreeBSD.org> |
- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invocation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip
|
#
6272798a |
|
09-Nov-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Both vn_close() and VFS_PROLOGUE() evaluate vp->v_mount twice, without holding the vnode lock; vp->v_mount is checked first for NULL equality, and then dereferenced if not NULL. If vnode is reclaimed meantime, second dereference would still give NULL. Change VFS_PROLOGUE() to evaluate the mp once, convert MNTK_SHARED_WRITES and MNTK_EXTENDED_SHARED tests into inline functions. Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
3fded357 |
|
18-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Fix panic in ktrcapfail() when no capability rights are passed. While here, correct all consumers to pass NULL instead of 0 as we pass capability rights as pointers now, not uint64_t. Reported by: Daniel Peyrolon Tested by: Daniel Peyrolon Approved by: re (marius)
|
#
7008be5b |
|
04-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. 
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. 
This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation
|
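The bit layout described in this entry can be exercised with a short sketch. The `SK_` names and the helper function are hypothetical; the bit positions follow the description above (top two bits for the array size, the next five bits as a one-hot array index), and the three sample rights reuse the values quoted in the message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bits 62..63 of element 0 hold the array size minus 2; bits 57..61 of
 * every element hold a one-hot array index. The low bits carry the right.
 */
#define SK_CAPRIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define SK_INDEX_MASK		(0x1fULL << 57)

#define SK_CAP_LOOKUP	SK_CAPRIGHT(0, 0x0000000000000400ULL)
#define SK_CAP_FCHMOD	SK_CAPRIGHT(0, 0x0000000000002000ULL)
#define SK_CAP_PDKILL	SK_CAPRIGHT(1, 0x0000000000000800ULL)

/*
 * Return the array index encoded in a right (or an OR of rights), or -1
 * when the index bits do not name exactly one element.
 */
int
sk_right_index(uint64_t right)
{
	uint64_t idx = (right & SK_INDEX_MASK) >> 57;
	int i = 0;

	if (idx == 0 || (idx & (idx - 1)) != 0)
		return (-1);
	while ((idx & 1) == 0) {
		idx >>= 1;
		i++;
	}
	return (i);
}
```

Because the index is one-hot, OR-ing rights from different elements sets two index bits, which is how a combination such as `SK_CAP_LOOKUP | SK_CAP_PDKILL` can be detected and rejected, while `SK_CAP_LOOKUP | SK_CAP_FCHMOD` still maps cleanly to element 0.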
#
456597e7 |
|
05-Aug-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not override the ENOENT error for the empty path, or EFAULT errors from copyins, with the relative lookup check. Discussed with: rwatson Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
c686ee46 |
|
01-Apr-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not call the VOP_LOOKUP() for the doomed directory vnode. The vnode could be reclaimed while lock upgrade was performed. Sponsored by: The FreeBSD Foundation Reported and tested by: pho Diagnosed and reviewed by: rmacklem MFC after: 1 week
|
#
2609222a |
|
01-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge Capsicum overhaul: - Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights. - The cap_new(2) system call is left, but it is no longer documented and should not be used in new code. - The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one. - The cap_getrights(2) syscall is renamed to cap_rights_get(2). - If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrieved with cap_ioctls_get(2) syscall. - If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrieve them with cap_fcntls_get(2). - To support ioctl and fcntl white-listing the filedesc structure was heavily modified. - The audit subsystem, kdump and procstat tools were updated to recognize new syscalls. - Capability rights were revised and even though I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below: CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT. Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2). Added CAP_SYMLINKAT: - Allow for symlinkat(2). Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2). Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory. Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall. Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. - Required if target of renameat(2) exists and will be removed by this call. 
Removed CAP_MAPEXEC.
CAP_MMAP old behaviour:
- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE.
CAP_MMAP new behaviour:
- Allow for mmap(2)+PROT_NONE.
Added CAP_MMAP_R:
- Allow for mmap(PROT_READ).
Added CAP_MMAP_W:
- Allow for mmap(PROT_WRITE).
Added CAP_MMAP_X:
- Allow for mmap(PROT_EXEC).
Added CAP_MMAP_RW:
- Allow for mmap(PROT_READ | PROT_WRITE).
Added CAP_MMAP_RX:
- Allow for mmap(PROT_READ | PROT_EXEC).
Added CAP_MMAP_WX:
- Allow for mmap(PROT_WRITE | PROT_EXEC).
Added CAP_MMAP_RWX:
- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).
Renamed CAP_MKDIR to CAP_MKDIRAT.
Renamed CAP_MKFIFO to CAP_MKFIFOAT.
Renamed CAP_MKNOD to CAP_MKNODAT.
CAP_READ old behaviour:
- Allow pread(2).
- Disallow read(2), readv(2) (if there is no CAP_SEEK).
CAP_READ new behaviour:
- Allow read(2), readv(2).
- Disallow pread(2) (CAP_SEEK was also required).
CAP_WRITE old behaviour:
- Allow pwrite(2).
- Disallow write(2), writev(2) (if there is no CAP_SEEK).
CAP_WRITE new behaviour:
- Allow write(2), writev(2).
- Disallow pwrite(2) (CAP_SEEK was also required).
Added convenient defines:
	#define CAP_PREAD (CAP_SEEK | CAP_READ)
	#define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
	#define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
	#define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
	#define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
	#define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
	#define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
	#define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
	#define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
	#define CAP_RECV CAP_READ
	#define CAP_SEND CAP_WRITE
	#define CAP_SOCK_CLIENT \
		(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
		CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
	#define CAP_SOCK_SERVER \
		(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
		CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
		CAP_SETSOCKOPT | CAP_SHUTDOWN)
Added defines for backward API compatibility:
	#define CAP_MAPEXEC CAP_MMAP_X
	#define CAP_DELETE CAP_UNLINKAT
	#define CAP_MKDIR CAP_MKDIRAT
	#define CAP_RMDIR CAP_UNLINKAT
	#define CAP_MKFIFO CAP_MKFIFOAT
	#define CAP_MKNOD CAP_MKNODAT
	#define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)
Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib
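The composite rights above can be read as plain bitmask unions: an operation is permitted only when every bit it needs is present on the descriptor. The following is a minimal sketch of that idea, not the kernel's implementation. The bit values and the helper name rights_allow() are assumptions made up for illustration; only the CAP_PREAD and CAP_PWRITE compositions are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define CAP_READ	0x1ULL
#define CAP_WRITE	0x2ULL
#define CAP_SEEK	0x4ULL

/* Composite rights, composed as listed in the commit message. */
#define CAP_PREAD	(CAP_SEEK | CAP_READ)
#define CAP_PWRITE	(CAP_SEEK | CAP_WRITE)

static_assert((CAP_READ & CAP_SEEK) == 0, "rights must be distinct bits");

/*
 * Hypothetical helper: true if every right in `needed` is also
 * present in `held`.
 */
static bool
rights_allow(uint64_t held, uint64_t needed)
{
	return ((held & needed) == needed);
}
```

Under these assumptions a descriptor holding only CAP_READ satisfies a read(2)-style check (CAP_READ) but not a pread(2)-style check (CAP_PREAD), which matches the new CAP_READ behaviour described above.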
|
#
593efaf9 |
|
21-Feb-2013 |
John Baldwin <jhb@FreeBSD.org> |
Further refine the handling of stop signals in the NFS client. The changes in r246417 were incomplete as they did not add explicit calls to sigdeferstop() around all the places that previously passed SBDRY to _sleep(). In addition, nfs_getcacheblk() could trigger a write RPC from getblk() resulting in sigdeferstop() recursing. Rather than manually deferring stop signals in specific places, change the VFS_*() and VOP_*() methods to defer stop signals for filesystems which request this behavior via a new VFCF_SBDRY flag. Note that this has to be a VFC flag rather than a MNTK flag so that it works properly with VFS_MOUNT() when the mount is not yet fully constructed. For now, only the NFS clients set this new flag in VFS_SET(). A few other related changes:
- Add an assertion to ensure that TDF_SBDRY doesn't leak to userland.
- When a lookup request uses VOP_READLINK() to follow a symlink, mark the request as being on behalf of the thread performing the lookup (cnp_thread) rather than using a NULL thread pointer. This causes NFS to properly handle signals during this VOP on an interruptible mount.
PR: kern/176179 Reported by: Russell Cattelan (sigdeferstop() recursion) Reviewed by: kib MFC after: 1 month
|
#
8909f88d |
|
01-Dec-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Fix one more compilation issue.
|
#
499f0f4d |
|
30-Nov-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
IFp4 @208451: Fix path handling for *at() syscalls. Before the change directory descriptor was totally ignored, so the relative path argument was appended to current working directory path and not to the path provided by descriptor, thus wrong paths were stored in audit logs. Now that we use directory descriptor in vfs_lookup, move AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2() calls to the place where we hold file descriptors table lock, so we are sure paths will be resolved according to the same directory in audit record and in actual operation. Sponsored by: FreeBSD Foundation (auditdistd) Reviewed by: rwatson MFC after: 2 weeks
|
#
f121e3e8 |
|
27-Nov-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Add NOCAPCHECK flag to namei that allows lookup to work even if the process is in capability mode.
- Add VN_OPEN_NOCAPCHECK flag for vn_open_cred() that will be converted into the NOCAPCHECK namei flag.
This functionality will be used to enable core dumps for sandboxed processes. Reviewed by: rwatson Obtained from: WHEEL Systems MFC after: 2 weeks
|
#
5050aa86 |
|
22-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate an interface change that does not alter the interface signatures. Conducted and reviewed by: attilio Tested by: pho
|
#
84c3cd4f |
|
09-Sep-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Add MNTK_LOOKUP_EXCL_DOTDOT struct mount flag, which specifies to the lookup code that dotdot lookups shall override any shared lock requests with the exclusive one. The flag is useful for filesystems which sometimes need to upgrade a shared lock to exclusive inside VOP_LOOKUP or later, which cannot be done safely for dotdot because dvp is also locked, which would cause a LOR. In collaboration with: pho MFC after: 3 weeks
|
#
cdb7a431 |
|
01-Jan-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Avoid double-unlock or double unreference for ndp->ni_dvp when the vnode dp lock upgrade right after the 'success' label fails. In collaboration with: pho MFC after: 1 week
|
#
e141be6f |
|
18-Oct-2011 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Revisit the capability failure trace points. The initial implementation only logged instances where an operation on a file descriptor required capabilities which the file descriptor did not have. By adding a type enum to struct ktr_cap_fail, we can catch other types of capability failures as well, such as disallowed system calls or attempts to wrap a file descriptor with more capabilities than it had to begin with.
|
#
69d377fe |
|
13-Aug-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Allow Capsicum capabilities to delegate constrained access to file system subtrees to sandboxed processes.
- Use of absolute paths and '..' is limited in capability mode.
- Use of absolute paths and '..' is limited when looking up relative to a capability.
- When a name lookup is performed, identify what operation is to be performed (such as CAP_MKDIR) as well as check for CAP_LOOKUP.
With these constraints, openat() and friends are now safe in capability mode, and can then be used by code such as the capability-mode runtime linker. Approved by: re (bz), mentor (rwatson) Sponsored by: Google Inc
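The path constraints in this commit can be pictured as a filter over path components. The sketch below is a userland model, not the kernel code: it assumes "limited" means rejected outright, and the function name capmode_path_ok() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model of the capability-mode path constraints:
 * reject absolute paths and any ".." component.
 */
static bool
capmode_path_ok(const char *path)
{
	const char *p = path;

	assert(path != NULL);
	if (*p == '/')
		return (false);		/* absolute path */
	while (*p != '\0') {
		size_t len = strcspn(p, "/");	/* length of this component */

		if (len == 2 && p[0] == '.' && p[1] == '.')
			return (false);	/* ".." could escape the subtree */
		p += len;
		while (*p == '/')	/* skip separator(s) */
			p++;
	}
	return (true);
}
```

With this model, "lib/libc.so" passes while "/etc/passwd" and "a/../b" are refused; a name such as "..b" is an ordinary component and is allowed.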
|
#
a9d2f8d8 |
|
10-Aug-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0: Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op. Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions. In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit. Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent. Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
79856499 |
|
22-Aug-2010 |
Rui Paulo <rpaulo@FreeBSD.org> |
Add an extra comment to the SDT probes definition. This allows us to use '-' in probe names, matching the probe names in Solaris.[1] Add userland SDT probes definitions to sys/sdt.h. Sponsored by: The FreeBSD Foundation Discussed with: rwatson [1]
|
#
3634d5b2 |
|
20-Aug-2010 |
John Baldwin <jhb@FreeBSD.org> |
Add dedicated routines to toggle lockmgr flags such as LK_NOSHARE and LK_CANRECURSE after a lock is created. Use them to implement macros that otherwise manipulated the flags directly. Assert that the associated lockmgr lock is exclusively locked by the current thread when manipulating these flags to ensure the flag updates are safe. This last change required some minor shuffling in a few filesystems to exclusively lock a brand new vnode slightly earlier. Reviewed by: kib MFC after: 3 days
|
#
1732ca8f |
|
31-May-2010 |
Robert Watson <rwatson@FreeBSD.org> |
Merge r203410 from head to stable/8: Only audit pathnames in namei(9) if copying the directory string completes successfully. Continue to do this before the empty path check so that the ENOENT returned in that case gets an empty string token in the BSM record. Approved by: re (kib)
|
#
246b6510 |
|
26-Mar-2010 |
Jaakko Heinonen <jh@FreeBSD.org> |
Support only LOOKUP operation for "/" in relookup() because lookup() can't succeed for CREATE, DELETE and RENAME. Discussed with: bde
|
#
b10c6cf4 |
|
02-Feb-2010 |
Robert Watson <rwatson@FreeBSD.org> |
Only audit pathnames in namei(9) if copying the directory string completes successfully. Continue to do this before the empty path check so that the ENOENT returned in that case gets an empty string token in the BSM record. MFC after: 3 days
|
#
5c303791 |
|
17-Nov-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r199137: Detect the slashdot lookup for RENAME or REMOVE in lookup(), and return EINVAL.
|
#
88e6f61a |
|
10-Nov-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
When rename("a", "b/.") is performed, target namei() call returns dvp == vp. Rename syscall does not check for the case, and at least ufs_rename() cannot deal with it. POSIX explicitly requires that both rename(2) and rmdir(2) return EINVAL when any of the paths end in "/.". Detect the slashdot lookup for RENAME or REMOVE in lookup(), and return EINVAL. Reported by: Jim Meyering <jim meyering net> Tested by: simon, pho MFC after: 1 week
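The check described here amounts to inspecting the final path component when the operation is a delete or rename. The following is a simplified userland sketch under stated assumptions: the enum, the function name check_last_component(), and the handling of trailing slashes are all illustrative and are not taken from the kernel source.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Operation kinds, named after the namei operations mentioned in the log. */
enum nameiop { LOOKUP, CREATE, DELETE, RENAME };

/*
 * Hypothetical helper: return EINVAL if the last component of `path`
 * is "." and the operation is DELETE or RENAME, else 0.
 */
static int
check_last_component(const char *path, enum nameiop op)
{
	size_t end, start;

	assert(path != NULL);
	if (op != DELETE && op != RENAME)
		return (0);
	end = strlen(path);
	while (end > 0 && path[end - 1] == '/')	/* ignore trailing slashes */
		end--;
	start = end;
	while (start > 0 && path[start - 1] != '/')	/* find component start */
		start--;
	if (end - start == 1 && path[start] == '.')
		return (EINVAL);
	return (0);
}
```

In this model rename to "b/." fails with EINVAL, while a plain lookup of "b/." and a rename to "b/.c" are unaffected.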
|
#
791b0ad2 |
|
29-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Eliminate ARG_UPATH[12] arguments to AUDIT_ARG_UPATH() and instead provide specific macros, AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2() to capture path information for audit records. This allows us to move the definitions of ARG_* out of the public audit header file, as they are an implementation detail of our current kernel-internal audit record, which may change. Approved by: re (kensmith) Obtained from: TrustedBSD Project MFC after: 1 month
|
#
b146fc1b |
|
28-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Rework vnode argument auditing to follow the same structure, in order to avoid exposing ARG_ macros/flag values outside of the audit code in order to name which one of two possible vnodes will be audited for a system call. Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 month
|
#
e4b4bbb6 |
|
28-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Audit file descriptors passed to fooat(2) system calls, which are used instead of the root/current working directory as the starting point for lookups. Up to two such descriptors can be audited. Add audit record BSM encoding for fooat(2). Note: due to an error in the OpenBSM 1.1p1 configuration file, a further change is required to that file in order to fix openat(2) auditing. Approved by: re (kib) Reviewed by: rdivacky (fooat(2) portions) Obtained from: TrustedBSD Project MFC after: 1 month
|
#
14961ba7 |
|
27-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Replace AUDIT_ARG() with variable argument macros with a set of more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week
|
#
322ef7cc |
|
05-Jun-2009 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Eliminate trailing_slash, which was made redundant in r193028. Remove a couple of 4-year-old "temporary" KASSERTs. Improve comments. MFC after: 1 week
|
#
bcf11e8d |
|
05-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
|
#
32bf7cdf |
|
29-May-2009 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Let vfs_lookup() return ENOTDIR if the path has a trailing slash and the last component is a symlink to something that isn't a directory. We introduce a new namei flag, TRAILINGSLASH, which is set by lookup() if the last component is followed by a slash. The trailing slash is then stripped, as before. If the final component is a symlink, lookup() will return to namei(), which will expand the symlink and call lookup() with the new path. When all symlinks have been resolved, lookup() checks if the TRAILINGSLASH flag is set, and if it is, and the vnode it ended up with is not a directory, it returns ENOTDIR. PR: kern/21768 Submitted by: Eygene Ryabinkin <rea-fbsd@codelabs.ru> MFC after: 3 weeks
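The two-phase behaviour described in this message (remember that a trailing slash was stripped, then check the final vnode type once all symlinks are resolved) can be sketched in userland as follows. This is only an illustration: the flag value and both function names are assumptions, and the is_dir argument stands in for the vnode type check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TRAILINGSLASH	0x1	/* hypothetical value for this sketch */

/*
 * Strip trailing slashes from `path` in place and report whether any
 * were removed. A lone "/" is left untouched.
 */
static int
strip_trailing_slashes(char *path)
{
	size_t len;
	int flags = 0;

	assert(path != NULL);
	len = strlen(path);
	while (len > 1 && path[len - 1] == '/') {
		path[--len] = '\0';
		flags |= TRAILINGSLASH;
	}
	return (flags);
}

/*
 * Final check once the lookup has finished: a path that was written
 * with a trailing slash must name a directory.
 */
static int
check_trailing_slash(int flags, bool is_dir)
{
	if ((flags & TRAILINGSLASH) != 0 && !is_dir)
		return (ENOTDIR);
	return (0);
}
```

Because the flag is recorded before the slash is stripped, it survives symlink expansion, so "link/" where link points at a regular file ends in ENOTDIR in this model.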
|
#
b181c8aa |
|
29-May-2009 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Fix misleading comment. MFC after: 1 week
|
#
0304c731 |
|
27-May-2009 |
Jamie Gritton <jamie@FreeBSD.org> |
Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
|
#
22d7ae67 |
|
11-May-2009 |
Attilio Rao <attilio@FreeBSD.org> |
Fix a kernel compilation error, introduced after r191990, by defining thread with curthread in the AUDIT case. Reported by: dchagin
|
#
dfd233ed |
|
11-May-2009 |
Attilio Rao <attilio@FreeBSD.org> |
Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and related parts don't take the context, as it always refers to curthread. In some places, in particular when dealing with VOPs and functions living in the same namespace (e.g. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here, fix a bug: UFS_EXTATTR can now be compiled alone without the UFS_EXTATTR_AUTOSTART option. The VFS KPI is heavily changed by this commit, so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal this.
|
#
4b4e58ba |
|
06-Apr-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Add SDT DTrace probes for namei(): vfs:namei:lookup:entry takes parent directory vnode pointer, path to look up, and lookup flags. vfs:namei:lookup:return takes an error value, and if successful, the returned vnode pointer. MFC after: 1 month
|
#
049ce093 |
|
24-Mar-2009 |
John Baldwin <jhb@FreeBSD.org> |
When a file lookup fails due to encountering a doomed vnode from a forced unmount, consistently return ENOENT rather than EBADF. Reviewed by: kib MFC after: 1 month
|
#
a6b6eb6b |
|
11-Mar-2009 |
John Baldwin <jhb@FreeBSD.org> |
Gah, fix the code to match the comment. For non-open lookups use a shared vnode lock for the leaf vnode if LOCKSHARED is set. Submitted by: rdivacky
|
#
33fc3625 |
|
11-Mar-2009 |
John Baldwin <jhb@FreeBSD.org> |
Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a filesystem supports additional operations using shared vnode locks. Currently this is used to enable shared locks for open() and close() of read-only file descriptors.
- When an ISOPEN namei() request is performed with LOCKSHARED, use a shared vnode lock for the leaf vnode only if the mount point has the extended shared flag set.
- Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but not O_CREAT.
- Use a shared vnode lock around VOP_CLOSE() if the file was opened with O_RDONLY and the mountpoint has the extended shared flag set.
- Adjust md(4) to upgrade the vnode lock on the vnode it gets back from vn_open() since it now may only have a shared vnode lock.
- Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since FIFOs require exclusive vnode locks for their open() and close() routines. (My recent MPSAFE patches for UDF and cd9660 already included this change.)
- Enable extended shared operations on UFS, cd9660, and UDF.
Submitted by: ups Reviewed by: pjd (ZFS bits) MFC after: 1 month
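The leaf-lock rule from the first bullet, combined with the follow-up fix for non-open lookups recorded in this log, reduces to a small decision function. The sketch below is illustrative only: the flag values and the name leaf_lock_shared() are assumptions, and other inputs the kernel consults (such as the vfs.lookup_shared setting) are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch. */
#define ISOPEN			0x1u	/* namei flag: lookup is for open() */
#define LOCKSHARED		0x2u	/* namei flag: caller accepts a shared lock */
#define MNTK_EXTENDED_SHARED	0x1u	/* mount flag: fs supports shared open/close */

static_assert(ISOPEN != LOCKSHARED, "namei flags must be distinct bits");

/*
 * Return true if the leaf vnode may be locked shared, false if an
 * exclusive lock is required.
 */
static bool
leaf_lock_shared(unsigned namei_flags, unsigned mnt_kern_flags)
{
	if ((namei_flags & LOCKSHARED) == 0)
		return (false);		/* caller did not ask for shared */
	if ((namei_flags & ISOPEN) != 0 &&
	    (mnt_kern_flags & MNTK_EXTENDED_SHARED) == 0)
		return (false);		/* open on a fs without extended shared */
	return (true);
}
```

A non-open LOCKSHARED lookup gets a shared leaf lock regardless of the mount flag, while an ISOPEN lookup gets one only on a mount that advertises MNTK_EXTENDED_SHARED.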
|
#
2cfddad7 |
|
18-Dec-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not return success and doomed vnode from lookup. LK_UPGRADE allows the vnode to be reclaimed. Tested by: pho MFC after: 1 month
|
#
1ba4a712 |
|
17-Nov-2008 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. This brings a huge amount of changes; I'll enumerate only user-visible changes:
- Delegated Administration
	Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc.
- L2ARC
	Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content.
- slog
	Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2).
- vfs.zfs.super_owner
	Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one.
- chflags(2)
	Not all the flags are supported. This still needs work.
- ZFSBoot
	Support to boot off of ZFS pool. Not finished, AFAIK. Submitted by: dfr
- Snapshot properties
- New failure modes
	Before, if a write request failed, the system panicked. Now one can select from one of three failure modes:
	- panic - panic on write error
	- wait - wait for disk to reappear
	- continue - serve read requests if possible, block write requests
- Refquota, refreservation properties
	Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots.
- Sparse volumes
	ZVOLs that don't reserve space in the pool.
- External attributes
	Compatible with extattr(2).
- NFSv4-ACLs
	Not sure about the status, might not be complete yet. Submitted by: trasz
- Creation-time properties
- Regression tests for zpool(8) command.
Obtained from: OpenSolaris
|
#
0f54f8c2 |
|
03-Nov-2008 |
John Baldwin <jhb@FreeBSD.org> |
A few style nits.
|
#
83b3bdbc |
|
02-Nov-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Improve VFS locking:
- Implement real draining for vfs consumers by not relying on the mnt_lock and using instead a refcount in order to keep track of lock requesters.
- Due to the change above, remove the mnt_lock lockmgr because it is now useless.
- Due to the change above, vfs_busy() is no longer linked to a lockmgr. So change its KPI by removing the interlock argument and defining 2 new flags for it: MBF_NOWAIT, which basically replaces the LK_NOWAIT of the old version (which was unlinked from the lockmgr already), and MBF_MNTLSTLOCK, which provides the ability to drop the mountlist_mtx once the mnt interlock is held (ability still desired by most consumers).
- The stub used in vfs_mount_destroy(), which allows overriding the mnt_ref if running for more than 3 seconds, is now totally useless. Remove it, as it was meant to work with older versions. If a problem of "refcount held never going away" should appear, we will need to fix it properly rather than rely on such a hackish solution.
- Fix a bug where returning (with an error) from dounmount() was still leaving the MNTK_MWAIT flag on even if the waiters were actually woken up. Only one place in vfs_mount_destroy() is left because it is going to recycle the structure in any case, so it doesn't matter.
- Remove the markercnt refcount as it is useless.
This patch modifies VFS ABI and breaks KPI for vfs_busy() so manpages and __FreeBSD_version will be modified accordingly. Discussed with: kib Tested by: pho
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
b957a822 |
|
01-Oct-2008 |
John Baldwin <jhb@FreeBSD.org> |
Enable shared locks for path name lookups on supported filesystems (NFS client, UFS, and ZFS) by default.
|
#
d59701d0 |
|
01-Oct-2008 |
John Baldwin <jhb@FreeBSD.org> |
Remove the LOOKUP_SHARED kernel option. Instead, make vfs.lookup_shared a loader tunable (it was already a sysctl).
|
#
59d49325 |
|
31-Aug-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions. Manpages are updated accordingly. Tested by: Diego Sardina <siarodx at gmail dot com>
|
#
48b05c3f |
|
08-Apr-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat. Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho
|
#
57b4252e |
|
30-Mar-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho
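From userland, the effect of fd-relative lookups is visible through openat(2): the same relative name resolves against the current working directory when AT_FDCWD is passed, or against an open directory descriptor otherwise. The snippet below is a small userland sketch of that usage; the wrapper name open_dir_relative() is hypothetical, and it exercises only the standard POSIX interface rather than anything specific to this commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: open the directory `name`, resolved relative
 * to `dirfd`. Pass AT_FDCWD to resolve against the current working
 * directory instead of a descriptor.
 */
static int
open_dir_relative(int dirfd, const char *name)
{
	assert(name != NULL);
	return (openat(dirfd, name, O_RDONLY | O_DIRECTORY));
}
```

Calling open_dir_relative(AT_FDCWD, ".") opens the working directory, whereas passing a descriptor for "/" makes "." resolve to the root directory without changing the process's working directory.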
|
#
237fdd78 |
|
16-Mar-2008 |
Robert Watson <rwatson@FreeBSD.org> |
In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink
|
#
81c794f9 |
|
25-Feb-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>
|
#
628f51d2 |
|
24-Feb-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Introduce some functions in the vnode locks namespace and in the ffs namespace in order to handle lockmgr fields in a controlled way instead of spreading bogus stubs all around:
- VN_LOCK_AREC() allows lock recursion for a specified vnode
- VN_LOCK_ASHARE() allows lock sharing for a specified vnode
In FFS land:
- BUF_AREC() allows lock recursion for a specified buffer lock
- BUF_NOREC() disallows recursion for a specified buffer lock
Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as "sole" exception.
|
#
22db15c0 |
|
13-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the useless extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed later. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
cb05b60a |
|
09-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and, in particular, removes an annoying dependency, helping the next lockmgr() cleanup. The KPI, obviously, changes. Manpage and FreeBSD_version will be updated through further commits. As a side note, it is worth saying that the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
30d239bc |
|
24-Oct-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
	mac_<object>_<method/action>
	mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
|
#
b4d7e298 |
|
21-Sep-2007 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Fix some locking cases where we ask for an exclusively locked vnode, but we get a shared locked vnode instead when vfs.lookup_shared is set to 1. Discussed with: kib, kris Tested by: kris Approved by: re (kensmith)
|
#
e1e8f51b |
|
27-May-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Universally adopt most conventional spelling of acquire.
|
#
5e3f7694 |
|
04-Apr-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.
- Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
- Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
- Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
- Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.
In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff
|
#
e92d773f |
|
31-Mar-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Rather than ignoring any error return from getnewvnode() in nameiinit(), explicitly test and panic. This should not ever happen, but if it does, this is a preferred failure mode to a NULL pointer dereference in kernel. Coverity CID: 1716 Found with: Coverity Prevent(tm)
|
#
478a8db4 |
|
15-Feb-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
If both ISDOTDOT and NOCROSSMOUNT are set then lookup() might break out of the special handling for ".." and perform an ISDOTDOT VOP_LOOKUP() for a filesystem root vnode. Handle this case inside lookup(). Submitted by: tegge PR: 92785 MFC after: 1 week
|
#
7f92c4ee |
|
22-Jan-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Below is a slightly edited description of the LOR by Tor Egge:
--------------------------
[Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode (that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode.
A simplified scenario:
	root fs			var fs
	/		A	/ (/var)		D
	/var		B	/log (/var/log)		E
	vfs lock	C	vfs lock		F
Within each file system, the lock order is clear: C->A->B and F->D->E
When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order:
	L1: C->A->B
	          |
	          +->F->D->E
	L2: F->D->E
	          |
	          +->C->A->B
The lookup() process for namei("/var") mixes those two lock orders:
	VOP_LOOKUP() obtains B while A is held
	vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2)
	vput() releases lock on B
	VOP_UNLOCK() releases lock on A
	VFS_ROOT() obtains lock on D while shared lock on F is held
	vfs_unbusy() releases shared lock on F
	vn_lock() obtains lock on A while D is held (violates L1, follows L2)
dounmount() follows L1 (B is locked while F is drained).
Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed).
With unmount, you can get 4 processes in a deadlock:
	p1: holds D, want A (in lookup())
	p2: holds shared lock on F, want D (in VFS_ROOT())
	p3: holds B, want drain lock on F (in dounmount())
	p4: holds A, want B (in VOP_LOOKUP())
You can have more than one instance of p2.
The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode.
- Tor Egge
To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei.
Thus, a placeholder deadfs vnode, vp_crossmp, is introduced and filled into ni_dvp. Idea by: ups Reviewed by: tegge, ups, jeff, rwatson (mac interaction) Tested by: Peter Holm MFC after: 2 weeks
|
#
aed55708 |
|
22-Oct-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA
|
#
3c5b80d6 |
|
14-Sep-2006 |
Mohan Srinivasan <mohans@FreeBSD.org> |
Fix for a potential bug caught by Coverity. Pointed out to me by Kris Kennaway.
|
#
7d7d9e22 |
|
13-Sep-2006 |
Mohan Srinivasan <mohans@FreeBSD.org> |
Fixes up the handling of shared vnode lock lookups in the NFS client, adds a FS type specific flag indicating that the FS supports shared vnode lock lookups, adds some logic in vfs_lookup.c to test this flag and set lock flags appropriately. - amd on 6.x is a non-starter (without this change). Using amd under heavy load results in a deadlock (with cascading vnode locks all the way to the root) very quickly. - This change should also fix the more general problem of cascading vnode deadlocks when an NFS server goes down. Ideally, we wouldn't need these changes, as enabling shared vnode lock lookups globally would work. Unfortunately, UFS, for example isn't ready for shared vnode lock lookups, crashing pretty quickly. This change is the result of discussions with Stephan Uphoff (ups@). Reviewed by: ups@
|
#
5111b5e1 |
|
05-Aug-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Remove register, use ANSI function headers.
|
#
12de4510 |
|
05-Aug-2006 |
Robert Watson <rwatson@FreeBSD.org> |
We now spell "inode" as "vnode" in the VFS layer, so update comment for new world order. MFC after: 3 days Pointed out by: mckusick
|
#
cef31ff7 |
|
29-Apr-2006 |
Kris Kennaway <kris@FreeBSD.org> |
Lock giant when assigning ni_vp and keep vfslocked state valid. Committed for: jeff
|
#
4b5b8681 |
|
27-Apr-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Consistently track ni_dvp and ni_vp with dvfslocked and vfslocked rather than trying to optimize it into a single lock. This adds more calls to lock giant with non smpsafe filesystems but is the only way to reliably hold the correct lock. - Remove an invalid assert in the mountedhere case in lookup and fix the code to properly deal with the scenario. We can actually have a lookup that returns dp == dvp with mountedhere set with certain unmount races. Tested by: kris Reported by: kris/mohans
|
#
fdf86b2d |
|
30-Mar-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- LK_RETRY means nothing when passed to VOP_LOCK. Call vn_lock instead. - Move the vn_lock of the dvp until after we've unbusied the filesystem to avoid a LOR with the mount point lock. - In the v_mountedhere while loop we acquire a new instance of giant each time through without releasing the first. This would cause us to leak Giant. Sponsored by: Isilon Systems, Inc.
|
#
2f0bca55 |
|
06-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Don't check v_mount for NULL to determine if a vnode has been recycled. Use the more appropriate VI_DOOMED flag instead. Sponsored by: Isilon Systems, Inc. MFC After: 1 week
|
#
95fea57c |
|
05-Feb-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Add AUDITVNODE[12] flags to namei(), which cause namei() to audit path and vnode attribute information for looked up vnodes during the lookup operation. This will allow consumers of namei() to specify that this information be added to the in-process audit record. Submitted by: wsalamon Obtained from: TrustedBSD Project
|
#
9157b485 |
|
01-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Solve a problem where a vput could be called on an outgoing directory without Giant held. Do this by tracking the vfslocked state for the directory separately from the child. This is only important in the case where we cross a mountpoint. Sponsored by: Isilon Systems, Inc. MFC After: 3 days
|
#
1dd5fc0f |
|
22-Jan-2006 |
Don Lewis <truckman@FreeBSD.org> |
Tweak previous vfs_lookup.c commit to return an EINVAL error from lookup() instead of EPERM when a DELETE or RENAME operation is attempted on "..". In kern_unlink(), remap EINVAL errors returned from namei() to EPERM to match existing (and POSIX required) behaviour. Discussed with: bde MFC after: 3 days
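
The error handling described in this entry and the one below it can be pictured with a small userland sketch. It is only an illustration under stated assumptions: `enum nameiop_sketch`, `check_last_component()` and `remap_unlink_error()` are hypothetical names invented here and are not the kernel's actual functions or data types.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the cn_nameiop values mentioned in the log. */
enum nameiop_sketch { OP_LOOKUP, OP_CREATE, OP_DELETE, OP_RENAME };

/*
 * Sketch of the lookup-side check: refuse DELETE or RENAME when the last
 * path component is "..", so vnodes are never locked in reverse order.
 */
static int
check_last_component(enum nameiop_sketch op, const char *last)
{
	assert(last != NULL);
	if ((op == OP_DELETE || op == OP_RENAME) && strcmp(last, "..") == 0)
		return (EINVAL);
	return (0);
}

/*
 * Sketch of the unlink-side remap: EINVAL from the lookup becomes EPERM,
 * every other error value passes through unchanged.
 */
static int
remap_unlink_error(int error)
{
	return (error == EINVAL ? EPERM : error);
}
```

Under these assumptions, `remap_unlink_error(check_last_component(OP_DELETE, ".."))` yields `EPERM`, which is the behaviour the log entry says unlink must keep, while a plain lookup of ".." is unaffected.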
|
#
bea7a8d7 |
|
21-Jan-2006 |
Don Lewis <truckman@FreeBSD.org> |
Return EPERM from lookup() if cn_nameiop is DELETE or RENAME and the last component of the path name is "..". This keeps VOP_LOOKUP() from locking vnodes in reverse order. Tested by: Denis Shaposhnikov <dsh AT vlink DOT ru> MFC after: 3 days
|
#
e12560dd |
|
21-Sep-2005 |
John Baldwin <jhb@FreeBSD.org> |
Use correct VFS locking rather than unconditionally grabbing Giant around namei() calls in kern_alternate_path(). Reviewed by: csjp MFC after: 1 week
|
#
68ff2a43 |
|
15-Sep-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Improve the MP safeness associated with the creation of symbolic links and the execution of ELF binaries. Two problems were found: 1) The link path wasn't tagged as being MP safe and thus was not properly protected. 2) The ELF interpreter vnode wasn't being locked in namei(9) and thus was insufficiently protected. This commit makes the following changes: -Sets the MPSAFE flag in NDINIT for symbolic link paths -Sets the MPSAFE flag in NDINIT and introduce a vfslocked variable which will be used to instruct VFS_UNLOCK_GIANT to unlock Giant if it has been picked up. -Drop in an assertion into vfs_lookup which ensures that if the MPSAFE flag is NOT set, that we have picked up giant. If not, panic (if WITNESS is compiled into the kernel). This should help us find conditions where vnode operations are insufficiently protected. This is a RELENG_6 candidate. Discussed with: jeff MFC after: 4 days
|
#
0c207975 |
|
14-Aug-2005 |
Alexander Kabaev <kan@FreeBSD.org> |
Do not keep parent directory locked while calling VFS_ROOT to traverse mount points in lookup(). The lock can be dropped safely around VFS_ROOT because LOCKPARENT semantics with child and parent vnodes coming from different FSes does not really have any meaningful use. On the other hand, this prevents easily triggered deadlock on systems using automounter daemon.
|
#
74a51232 |
|
13-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove a debugging printf that slipped in. Spotted by: Peter Wemm
|
#
18ef8344 |
|
13-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Further simplify lookup; Force all filesystems to relock in the DOTDOT case. There are bugs in some which didn't unlock in the ISDOTDOT case to begin with that need to be addressed separately. This simplifies things anyway. - Fix relookup() to prevent it from vrele()'ing the dvp while the vp is locked. Catch up to other lookup changes. Sponsored by: Isilon Systems, Inc. Reported by: Peter Wemm
|
#
d3b78f73 |
|
09-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- If we vrele() a dvp while the child is locked we can potentially deadlock when vrele() acquires the directory lock in the wrong order. Fix this via the following changes: - Keep the directory locked after VOP_LOOKUP() until we've determined what we're going to do with the child. This allows us to remove the complicated post LOOKUP code which determines whether we should lock or unlock the parent. This means we may have to vput() in the appropriate cases later, rather than doing an unsafe vrele. - in NDFREE() keep two flags to indicate whether we need to unlock vp or dvp. This allows us to vput rather than vrele in the appropriate cases without rechecking the flags. Move the code to handle dvp after we handle vp. - Remove some dead code from namei() that was the result of changes to VFS_LOCK_GIANT(). Sponsored by: Isilon Systems, Inc.
|
#
2bbd6c98 |
|
05-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Move NDFREE() from vfs_subr to vfs_lookup where namei() is.
|
#
9a6bb8ad |
|
03-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Include opt_vfs.h for LOOKUP_SHARED. - Control the behavior of shared lookups with the lookup_shared sysctl which has its default behavior set via the LOOKUP_SHARED option.
|
#
99f3c870 |
|
29-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Set cn_lkflags to LK_SHARED in the LOOKUP_SHARED case so that we only acquire shared locks on intermediate directories. - For the LASTCN, we may have to LK_UPGRADE the parent directory before we lookup the last component. - Acquire VFS_ROOT and dp locks based on the cn_lkflag. Sponsored by: Isilon Systems, Inc.
|
#
ea9aa09d |
|
28-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove an unused variable from relookup(). - Assert that REMOVE, CREATE, and RENAME callers have WANTPARENT or LOCKPARENT set. You can't complete any of these operations without at least a reference to the parent. Many filesystems check for this case even though it isn't possible in the current system.
|
#
1e38e08e |
|
28-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Get rid of PDIRUNLOCK, instead, we fixup the lock state immediately after calling VOP_LOOKUP(). Rather than having each filesystem check the LOCKPARENT flag, we simply check it once here and unlock as required. The only unusual case is ISDOTDOT, where we require an unlocked vnode on return. Relocking this vnode with the child locked is allowed since the child is actually its parent. - Add a few asserts for some unusual conditions that I do not believe can happen. These will later go away and turn into implementations for these conditions. Sponsored by: Isilon Systems, Inc.
|
#
d830f828 |
|
24-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Pass LK_EXCLUSIVE to VFS_ROOT() to satisfy the new flags argument. For now, all calls to VFS_ROOT() should still acquire exclusive locks. Sponsored by: Isilon Systems, Inc.
|
#
ad09e57f |
|
23-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Clear LOCKSHARED if LOOKUP_SHARED is not enabled. This is not strictly necessary since we disable the shared locks in vfs_cache, but it is preferred that the option not leak out into filesystems when it is disabled. Sponsored by: Isilon Systems, Inc.
|
#
76951d21 |
|
07-Feb-2005 |
John Baldwin <jhb@FreeBSD.org> |
- Tweak kern_msgctl() to return a copy of the requested message queue id structure in the struct pointed to by the 3rd argument for IPC_STAT and get rid of the 4th argument. The old way returned a pointer into the kernel array that the calling function would then access afterwards without holding the appropriate locks and doing non-lock-safe things like copyout() with the data anyways. This change removes that unsafeness and resulting race conditions as well as simplifying the interface. - Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(), fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of code duplication for freebsd4 and netbsd compatibility system calls. - Add a new lookup function kern_alternate_path() that looks up a filename under an alternate prefix and determines which filename should be used. This is basically a more general version of linux_emul_convpath() that can be shared by all the ABIs thus allowing for further reduction of code duplication.
|
#
dcff5b14 |
|
24-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't call VOP_CREATEVOBJECT(), it's the responsibility of the filesystem which owns the vnode.
|
#
22a960a6 |
|
24-Jan-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Acquire and release Giant as we enter and leave filesystems which require it. - Track the status of Giant with the nd flag HASGIANT. - Release giant on return of namei() callers are not marked MPSAFE as they already own giant. Sponsored By: Isilon Systems, Inc.
|
#
e39db32a |
|
12-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Ditch vfs_object_create() and make the callers call VOP_CREATEVOBJECT() directly.
|
#
9454b2d8 |
|
06-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
#
082d2122 |
|
02-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make NAMEI_DIAGNOSTIC compile again and add a strategic vprint()
|
#
7a36e1d6 |
|
04-Aug-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Assert Giant in namei(). Bugs have been reported in which, following a sleep() call waking up in namei(), a later assertion triggers that Giant is not held. By asserting Giant at the start of namei(), we can know that if that assertion triggers, Giant is lost during the call to namei(), and not before.
|
#
f257b7a5 |
|
12-Jul-2004 |
Alfred Perlstein <alfred@FreeBSD.org> |
Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.
|
#
7f8a436f |
|
05-Apr-2004 |
Warner Losh <imp@FreeBSD.org> |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core
|
#
677b542e |
|
10-Jun-2003 |
David E. O'Brien <obrien@FreeBSD.org> |
Use __FBSDID().
|
#
a163d034 |
|
18-Feb-2003 |
Warner Losh <imp@FreeBSD.org> |
Back out M_* changes, per decision of the TRB. Approved by: trb
|
#
44956c98 |
|
21-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
b614dd13 |
|
19-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Add a new 'NOMACCHECK' flag to namei() NDINIT flags, which permits the caller to indicate that MAC checks are not required for the lookup. Similar to IO_NOMACCHECK for vn_rdwr(), this indicates that the caller has already performed all required protections and that this is an internally generated operation. This will be used by the NFS server code, as we don't currently enforce MAC protections against requests delivered via NFS. While here, add NOCROSSMOUNT to PARAMASK; apparently this was used at one point for name lookup flag checking, but isn't any longer or it would have triggered from the NFS server code passing it to indicate that mountpoints shouldn't be crossed in lookups. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
e6e370a7 |
|
04-Aug-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
|
#
d03db429 |
|
31-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce support for Mandatory Access Control and extensible kernel access control. Authorize vop_readlink() and vop_lookup() activities during recursive path lookup via namei() via calls to appropriate MAC entry points. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
eeb92518 |
|
24-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Under #ifdef DIAGNOSTIC, NULL out componentname pointers if we free the pnbuf to increase the chances of detecting use of a free'd name buffer if SAVENAME or SAVESTART wasn't passed in. Curiously, running with these changes doesn't panic the kernel, and should.
|
#
60a9bb19 |
|
06-Jun-2002 |
John Baldwin <jhb@FreeBSD.org> |
Catch up to changes in ktrace API.
|
#
d394511d |
|
16-May-2002 |
Tom Rhodes <trhodes@FreeBSD.org> |
More s/file system/filesystem/g
|
#
c897b813 |
|
19-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.
|
#
8355f576 |
|
19-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@
|
#
bdd67d48 |
|
27-Feb-2002 |
John Baldwin <jhb@FreeBSD.org> |
- Change namei() to use td_ucred instead of p_ucred. - Change the hack in access() that uses a temporary credential to set td_ucred to the temp cred instead of p_ucred.
|
#
9e209b12 |
|
13-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Include sys/_lock.h and sys/_mutex.h to reduce namespace pollution. Requested by: jhb
|
#
426da3bc |
|
13-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file *fp); /* increments reference count on a file */ struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */ struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
|
#
c7503f60 |
|
23-Jun-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
After exhaustive discussions and some meandering and confusion, enough people are on track with the cause and effect of this, and although fixing this severely degenerate case appears to violate the letter of POSIX.1-200x, Bruce and I (and enough others) agree that it should be committed. So, this patch generates an ENOENT error for any attempt to do a path lookup through an empty symlink (e.g. open(), stat()). Submitted by: "Andrey A. Chernov" <ache@nagual.pp.ru> Reviewed by: bde Discussed exhaustively on: freebsd-current Previously committed to: NetBSD 4 years ago
|
#
fb919e4d |
|
01-May-2001 |
Mark Murray <markm@FreeBSD.org> |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
|
#
60fb0ce3 |
|
28-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Revert consequences of changes to mount.h, part 2. Requested by: bde
|
#
d98dc34f |
|
23-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Correct #includes to work with fixed sys/mount.h.
|
#
138e514c |
|
06-Dec-2000 |
Peter Wemm <peter@FreeBSD.org> |
Untangle vfsinit() a bit. Use separate sysinit functions rather than having a super-function calling bits all over the place.
|
#
23771027 |
|
30-Nov-2000 |
Alfred Perlstein <alfred@FreeBSD.org> |
This is a fix for a problem described in PR kern/19572. It was recently discussed at -hackers. The problem is a null-pointer dereference that happens in kern/vfs_lookup.c when accessing ".." with a v_mount entry for the current directory vnode of NULL. This happens when a volume is forcibly unmounted, and the vnode for a working directory in the mounted volume is cleared. PR: 23191 Submitted by: Thomas Moestl <tmoestl@gmx.net>
|
#
3ff1a2f4 |
|
17-Sep-2000 |
Boris Popov <bp@FreeBSD.org> |
Add new flag PDIRUNLOCK to the component.cn_flags which should be set by filesystem lookup() routine if it unlocks parent directory. This flag should be carefully tracked by filesystems if they want to work properly with nullfs and other stacked filesystems. VFS takes advantage of this flag to perform semantically correct usage of vrele() instead of vput() if parent directory is already unlocked. If filesystem fails to track this flag then previous codepath in VFS is left unchanged. Convert UFS code to set PDIRUNLOCK flag if necessary. Other filesystems will be changed after some period of testing. Reviewed in general by: mckusick, dillon, adrian Obtained from: NetBSD
|
#
64134168 |
|
13-Sep-2000 |
Boris Popov <bp@FreeBSD.org> |
Unlock current directory when calling VFS_ROOT() because underlying filesystem may hold the lock. Otherwise unavoidable deadlock will occur. This shouldn't have any side effects as long as we hold vfs lock. Obtained from: NetBSD
|
#
762e6b85 |
|
15-Dec-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Introduce NDFREE (and remove VOP_ABORTOP)
|
#
3b6fb885 |
|
02-Oct-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Before we start to mess with the VFS name-cache clean things up a little bit: Isolate the namecache in its own file, and give it a dedicated malloc type.
|
#
2fe5bd8b |
|
25-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a hole in jail(2). Noticed by: Alexander Bezroutchko <abb@zenon.net>
|
#
c3aac50f |
|
27-Aug-1999 |
Peter Wemm <peter@FreeBSD.org> |
$Id$ -> $FreeBSD$
|
#
67452993 |
|
26-Jul-1999 |
Alan Cox <alc@FreeBSD.org> |
Add sysctl and support code to allow directories to be VMIO'd. The default setting for the sysctl is OFF, which is the historical operation. Submitted by: dillon
|
#
8aef1712 |
|
27-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
d254af07 |
|
27-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
219cbf59 |
|
09-Jan-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
KNFize, by bde.
|
#
5526d2d9 |
|
08-Jan-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith
|
#
fb116777 |
|
05-Jan-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Remove the 'waslocked' parameter to vfs_object_create().
|
#
ecbb00a2 |
|
07-Jun-1998 |
Doug Rabson <dfr@FreeBSD.org> |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
#
5ddc8ded |
|
08-Apr-1998 |
Wolfram Schneider <wosch@FreeBSD.org> |
New mount option nosymfollow. If enabled, the kernel lookup() function will not follow symbolic links on the mounted file system and return EACCES (Permission denied).
|
#
9f24f214 |
|
14-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
Make the rootdir handling more consistent. Now, processes always have a root vnode associated with them, and no special checks for the null case are needed. Submitted by: terry@freebsd.org
|
#
0b08f5f7 |
|
05-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Back out DIAGNOSTIC changes.
|
#
47cfdb16 |
|
04-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Turn DIAGNOSTIC into a new-style option.
|
#
95e5e988 |
|
05-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. Vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon as reasonably possible.
|
#
2be70f79 |
|
28-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Lots of improvements, including restructuring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.
|
#
675ea6f0 |
|
26-Dec-1997 |
Bruce Evans <bde@FreeBSD.org> |
Unspammed nested include of <vm/vm_zone.h>.
|
#
99448ed1 |
|
20-Sep-1997 |
John Dyson <dyson@FreeBSD.org> |
Change the M_NAMEI allocations to use the zone allocator. This change plus the previous changes to use the zone allocator decrease the usage of malloc by half. The Zone allocator will be upgradeable to be able to use per CPU-pools, and has more intelligent usage of SPLs. Additionally, it has reasonable stats gathering capabilities, while making most calls inline.
|
#
e4ba6a82 |
|
02-Sep-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused #includes.
|
#
42146e37 |
|
04-Apr-1997 |
Doug Rabson <dfr@FreeBSD.org> |
[Previous comment was incorrect for these files] Added calls to VFS lock debugging macros to make fixing filesystems' locking easier.
|
#
de15ef6a |
|
04-Apr-1997 |
Doug Rabson <dfr@FreeBSD.org> |
Add a function vop_sharedlock which is a copy of vop_nolock without the implementation #ifdef out. This can be used for now by NFS. As soon as all the other filesystems' locking is fixed, this can go away. Print the vnode address in vprint for easier debugging.
|
#
6875d254 |
|
22-Feb-1997 |
Peter Wemm <peter@FreeBSD.org> |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
78fd7b3d |
|
17-Feb-1997 |
Bruce Evans <bde@FreeBSD.org> |
Fixed namei caching for LOOKUPs. It was broken for lstat() and olstat(). Successful lstat()s purged an existing entry as well as not caching the result. This bug was introduced in Lite1 by setting the LOCKPARENT flag for [o]lstat() in order to support the inherit-attributes-from-parent- directory misfeature for symlinks. LOCKPARENT was previously only set for CREATEs and DELETEs. It is now set for LOOKUPs, but only for [o]lstat(), so the problem wasn't very noticeable.
|
#
996c772f |
|
09-Feb-1997 |
John Dyson <dyson@FreeBSD.org> |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
1130b656 |
|
14-Jan-1997 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
4958bbd1 |
|
01-Dec-1996 |
Bruce Evans <bde@FreeBSD.org> |
Don't allow empty pathnames. POSIX standard. Most of the standard utilities that depended on (or were broken in a different way by) the old behaviour of interpreting "" as "." were fixed a year or two ago. There is still a fairly harmless bug in tar and a harmless bug in gzip. Tar apparently replaces "/" by "" when it strips leading slashes.
|
#
edbfedac |
|
11-Mar-1996 |
Peter Wemm <peter@FreeBSD.org> |
Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all files are off the vendor branch, so this should not change anything. A "U" marker generally means that the file was not changed in between the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally means that there was a change. [note new unused (in this form) syscalls.conf, to be 'cvs rm'ed]
|
#
db6a20e2 |
|
03-Jan-1996 |
Garrett Wollman <wollman@FreeBSD.org> |
Converted two options over to the new scheme: USER_LDT and KTRACE.
|
#
d68a4190 |
|
22-Oct-1995 |
David Greenman <dg@FreeBSD.org> |
Moved the filesystem read-only check out of the syscalls and into the filesystem layer, as was done in lite-2. Merged in some other cosmetic changes while I was at it. Rewrote most of msdosfs_access() to be more like ufs_access() and to include the FS read-only check. Obtained from: partially from 4.4BSD-lite2
|
#
27df9774 |
|
24-Aug-1995 |
Doug Rabson <dfr@FreeBSD.org> |
Add support for amd direct maps. Reviewed by: Thomas Graichen <graichen@sirius.physik.fu-berlin.de>
|
#
70eec742 |
|
30-Jul-1995 |
Bruce Evans <bde@FreeBSD.org> |
Ignore trailing slashes in pathnames that "refer to a directory", as is required to be POSIXLY_CORRECT and "right". I interpret "referring to a directory" as being a directory or becoming a directory. E.g., the trailing slashes in mkdir("/nonesuch/"), rename("/tmp", "/nonesuch/") and link("/tmp", "/root_can_like_dirs/") are ignored because the target will become a directory if the syscall succeeds. A trailing slash on a symlink causes the symlink to be followed (this is a bug if the symlink doesn't point to a directory; fix later).
|
#
9b2e5354 |
|
30-May-1995 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Remove trailing whitespace.
|
#
82478919 |
|
06-Oct-1994 |
David Greenman <dg@FreeBSD.org> |
Use tsleep() rather than sleep so that 'ps' is more informative about the wait.
|
#
38103198 |
|
27-Sep-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Moved the "relookup" routine into vfs_lookup.c from ufs/ufs/ufs_vnops.c. Several FS's use this, so it doesn't belong in ufs. (unionfs, msdosfs and ufs)
|
#
7b42c960 |
|
19-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
1) cleaned up after Garrett - fixed more redundant declarations, changed use of timeout_t -> timeout_func_t in aha1542 and aha1742 drivers. 2) fix a bug in the portalfs that was uncovered by better prototyping - specifically, the time must be converted from timeval to timespec before storing in va_atime. 3) fixed/added some miscellaneous prototypes
|
#
f23b4c91 |
|
18-Aug-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Fix up some sloppy coding practices: - Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above. NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
3c4dd356 |
|
02-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Added $Id$
|
#
df8bae1d |
|
24-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
BSD 4.4 Lite Kernel Sources
|