#
0cd9cde7 |
|
06-Apr-2024 |
Jake Freeland <jfree@FreeBSD.org> |
ktrace: Record namei violations with KTR_CAPFAIL Report namei path lookups while Capsicum violation tracing with CAPFAIL_NAMEI. vfs caching is also ignored when tracing to mimic capability mode behavior. Reviewed by: markj Approved by: markj (mentor) MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D40680
|
#
05296a0f |
|
06-Apr-2024 |
Jake Freeland <jfree@FreeBSD.org> |
ktrace: Record syscall violations with KTR_CAPFAIL Report syscalls that are not allowed in capability mode with CAPFAIL_SYSCALL. Reviewed by: markj Approved by: markj (mentor) MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D40678
|
#
d0efabdf |
|
19-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: make __sys_fcntl take an intptr_t The (optional) third argument of fcntl is sometimes a pointer so change the type to intptr_t. Update the libc-internal definition (actually used by libthr) to take a fixed intptr_t argument rather than pretending it's a variadic function. (That worked because all supported architectures pass variadic arguments as though the function was declared with those types. In CheriBSD that changes because variadic arguments are passed via a bounded array.) Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D44381
|
#
f04220c1 |
|
19-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
kcmp(2): implement for vnode files Reviewed by: brooks, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43518
|
#
58d31716 |
|
22-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
Add fget_remote() The function holds and returns struct file for a file descriptor index in the given process. Reviewed by: brooks, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43518
|
#
55edc40e |
|
04-Jan-2024 |
Mark Johnston <markj@FreeBSD.org> |
file: Remove the fd parameter to fgetvp_lookup() and fgetvp_lookup_smr() The fd is always obtained from nameidata, so just fetch it from there instead. No functional change intended. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D43257
|
#
29363fb4 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree using automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
|
#
56bb3ce0 |
|
25-Sep-2023 |
Olivier Certner <olce.freebsd@certner.fr> |
pdinit(): Fix comment Reviewed by: markj, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D42256
|
#
2af5ce5b |
|
09-Oct-2023 |
Zhenlei Huang <zlei@FreeBSD.org> |
fd: Add sysctl flag CTLFLAG_TUN to loader tunables The following sysctl variables are actually loader tunables. Add sysctl flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly. 1. kern.maxfiles 2. kern.maxfilesperproc No functional change intended. Reviewed by: kib, imp MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D42113
|
#
af93fea7 |
|
23-Aug-2023 |
Jake Freeland <jfree@freebsd.org> |
timerfd: Move implementation from linux compat to sys/kern Move the timerfd implementation from linux compat code to sys/kern. Use it to implement the new system calls for timerfd. Add a hook to kern_tc to allow timerfd to know when the system time has stepped. Add kqueue support to timerfd. Adjust a few names to be less Linux-centric. RelNotes: YES Reviewed by: markj (on irc), imp, kib (with reservations), jhb (slack) Differential Revision: https://reviews.freebsd.org/D38459
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
6c049996 |
|
09-Jul-2023 |
Alan Somers <asomers@FreeBSD.org> |
During F_SETFL, don't change file flags on error Previously, even if the FIONBIO or FIOASYNC ioctl failed, the file's f_flags variable would still be changed. Now, kern_fcntl will restore the original flags if the ioctl fails. PR: 265736 Reported by: Yuval Pavel Zholkover <paulzhol@gmail.com> MFC after: 2 weeks Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40955
|
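The rollback described in the F_SETFL entry above can be illustrated with a small userland sketch. This is not the kernel code: `struct fake_file`, `set_flags_with_rollback()`, and the two hooks are hypothetical names standing in for the file object, `kern_fcntl`'s flag update, and the FIONBIO/FIOASYNC ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file; only the flags word matters here. */
struct fake_file {
	int f_flags;
};

/* Hypothetical stand-in for the FIONBIO/FIOASYNC ioctl: 0 or an errno. */
typedef int (*flag_hook_fn)(struct fake_file *fp, int newflags);

/*
 * Store the new flags, then notify the object through the hook.  If the
 * hook fails, restore the saved flags so the caller observes no change.
 */
int
set_flags_with_rollback(struct fake_file *fp, int newflags, flag_hook_fn hook)
{
	int error, saved;

	assert(fp != NULL && hook != NULL);
	saved = fp->f_flags;
	fp->f_flags = newflags;
	error = hook(fp, newflags);
	if (error != 0)
		fp->f_flags = saved;	/* undo the update on failure */
	return (error);
}

/* Demonstration hooks: one always succeeds, one always fails. */
int
hook_ok(struct fake_file *fp, int newflags)
{
	(void)fp;
	(void)newflags;
	return (0);
}

int
hook_fail(struct fake_file *fp, int newflags)
{
	(void)fp;
	(void)newflags;
	return (ENOTTY);
}
```

With `hook_fail`, the call returns the hook's error and `f_flags` keeps its original value; with `hook_ok`, the new flags stay in place.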
#
3d2fec7d |
|
29-May-2023 |
Dmitry Chagin <dchagin@FreeBSD.org> |
namei: Add the ability for the ABI to specify an alternate root path For now a non-native ABI (i.e., Linux) uses the kern_alternate_path() facility to dynamically reroot lookups. First, an attempt is made to look up the file in /compat/linux/original-path. If that fails, the lookup is done in /original-path. That requires a bit of code in every ABI syscall implementation where path name translation is needed. Also, our kern_alternate_path() does not properly look up absolute symlinks on the second attempt, i.e., it does not append the /compat/linux part to the resolved link. The change is intended to avoid this by specifying the ABI root directory for namei(), using one call to pwd_altroot() during exec-time into the ABI. In that case namei() will dynamically reroot lookups as mentioned above. PR: 72920 Reviewed by: kib Differential revision: https://reviews.freebsd.org/D38933 MFC after: 2 months
|
#
37b9fb16 |
|
28-Dec-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
Add descrip_check_write_mp() helper ... which verifies that given file table does not have file descriptors referencing vnodes on the specified mount point. It is up to the caller to ensure that the check is not racy. Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37896
|
#
d07675a9 |
|
04-Aug-2022 |
Mark Johnston <markj@FreeBSD.org> |
file: Move code to share fdtol structs into kern_descrip.c This ensures the filedesc-to-leader code is consistently encapsulated in kern_descrip.c. No functional change intended. Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D35988
|
#
c84c5e00 |
|
18-Jul-2022 |
Mitchell Horne <mhorne@FreeBSD.org> |
ddb: annotate some commands with DB_CMD_MEMSAFE This is not completely exhaustive, but covers a large majority of commands in the tree. Reviewed by: markj Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D35583
|
#
362ff986 |
|
13-Apr-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert rest of a5970a529c2d95271: use vrefact() when working on fp->f_vnode Now that O_PATH-opened file descriptors hold use references instead of hold references, the vrefact() changes from that revision can be reverted. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34906
|
#
bf13db08 |
|
12-Apr-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
Mostly revert a5970a529c2d95271: Make files opened with O_PATH to not block non-forced unmount Problem is that open(O_PATH) on nullfs -o nocache is broken then, because there is no reference on the vnode after the open syscall exits. Reported and tested by: ambrisko Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
b7262756 |
|
02-Apr-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fixup WANTIOCTLCAPS on open In some cases vn_open_cred overwrites cn_flags, effectively nullifying initialisation done in NDINIT. This will have to be fixed. In the meantime make sure the flag is passed. Reported by: jenkins Noted by: Mathieu <sigsys@gmail.com>
|
#
0c805718 |
|
24-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: fix memory leak on lookup with fds with ioctl caps Reviewed by: markj PR: 262515 Noted by: firk@cantconnect.ru Differential Revision: https://reviews.freebsd.org/D34667
|
#
fc7e121d |
|
16-Mar-2022 |
Mark Johnston <markj@FreeBSD.org> |
file: Move FILEDESC_FOREACH macros to kern_descrip.c They are only used in kern_descrip.c, so make them private. No functional change intended. Discussed with: mjg Sponsored by: The FreeBSD Foundation
|
#
c7022422 |
|
16-Mar-2022 |
Mark Johnston <markj@FreeBSD.org> |
file: Avoid a read-after-free of fd tables in sysctl handlers Some loops access the fd table of a different process, and drop the filedesc lock while iterating, so they check the table's refcount. However, we access the table before the first iteration, in order to get the number of table entries, and this access can be a use-after-free. Fix the problem by checking the refcount before we start iterating. Reported by: pho Reviewed by: mjg MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34575
|
#
f3f3e3c4 |
|
03-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add close_range(..., CLOSE_RANGE_CLOEXEC) For compatibility with Linux. MFC after: 3 days Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D34424
|
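The effect of `CLOSE_RANGE_CLOEXEC` (mark a range of descriptors close-on-exec instead of closing them) can be approximated with plain POSIX `fcntl()` calls. The sketch below is a portable emulation for illustration only; `set_cloexec_range()` is a hypothetical helper, not the kernel implementation, and unlike the real system call it is not atomic with respect to other threads.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set FD_CLOEXEC on every open descriptor in
 * [lowfd, highfd].  Slots that are not open are skipped silently.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
set_cloexec_range(int lowfd, int highfd)
{
	int fd, flags;

	if (lowfd < 0 || highfd < lowfd) {
		errno = EINVAL;
		return (-1);
	}
	for (fd = lowfd; ; fd++) {
		flags = fcntl(fd, F_GETFD);
		if (flags != -1 &&
		    fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
			return (-1);
		if (fd == highfd)	/* compare before fd++ to avoid overflow */
			break;
	}
	return (0);
}
```

On a system that provides the flag, a single `close_range(lowfd, highfd, CLOSE_RANGE_CLOEXEC)` call would replace the loop.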
#
f17ef286 |
|
22-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: rename fget*_locked to fget*_noref This gets rid of the error prone naming where fget_unlocked returns with a ref held, while fget_locked requires a lock but provides nothing in terms of making sure the file lives past unlock. No functional changes.
|
#
e68a5225 |
|
14-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add fde_copy To dedup handrolled memcpy. This will be used later to make fd code atomic-clean.
|
#
ec12b4f4 |
|
14-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add missing seqc to dupfdopen
|
#
c9a99599 |
|
14-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
seqc: rename seqc_consistent_nomb to seqc_consistent_no_fence For more consistency with other primitives.
|
#
5c310250 |
|
29-Jan-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use FILEDESC_FOREACH_{FDE,FP} where appropriate
|
#
809f3121 |
|
29-Jan-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: assign fd_freefile early when copying This is to simplify an upcoming change.
|
#
893d20c9 |
|
29-Jan-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: move fd table sizing out of fdinit now it is placed with the rest of actual initialisation
|
#
4103c3cd |
|
29-Jan-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: drop volatile keyword from refcounts While here move a comment where it belongs and do small whitespace clean up.
|
#
513c7a6e |
|
10-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make fget_unlocked take a thread argument Just like other fget routines. This enables embedding fd table pointer in struct thread, avoiding taking a trip through proc.
|
#
45bb8bea |
|
01-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: elide one acquire fence in fget_unlocked_seq Still validate we got the stable state before returning an error though.
|
#
62849eef |
|
11-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: split fget_unlocked_seq depending on CAPABILITIES This will simplify an upcoming change.
|
#
b937908e |
|
11-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: split fget_cap depending on CAPABILITIES This will simplify an upcoming change.
|
#
300cfb96 |
|
07-Feb-2022 |
Mark Johnston <markj@FreeBSD.org> |
file: Make fget*() and getvnode*() consistent about initializing *fpp Most fget*() functions initialize the output parameter to NULL. Make the externally visible interface behave consistently, and make fget_unlocked_seq() private to kern_descrip.c. This fixes at least one bug in a consumer, _filemon_wrapper_openat(), which assumes that getvnode() sets the output file pointer to NULL upon an error. Reported by: syzbot+01c0459408f896a5933a@syzkaller.appspotmail.com Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34190
|
#
36bd49ac |
|
16-Dec-2021 |
Mark Johnston <markj@FreeBSD.org> |
fd: Avoid truncating output buffers for KERN_PROC_{CWD,FILEDESC} These sysctls failed to return an error if the caller had provided too short an output buffer. Change them to return ENOMEM instead, to ensure that callers can detect truncation in the face of a concurrently changing fd table. PR: 228432 Discussed with: cem, jhb MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D15607
|
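The idea behind the KERN_PROC_{CWD,FILEDESC} entry above is to report an error when the output does not fit, so the caller can tell truncation from a complete result. A minimal sketch follows. `copy_out_checked()` is a hypothetical helper, and copying nothing on a short buffer is a choice made for this sketch, not a statement about what the sysctl handlers copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy srclen bytes into dst only if they fit.
 * Returns 0 and sets *copied on success; returns ENOMEM and sets
 * *copied to 0 when the destination buffer is too short.
 */
int
copy_out_checked(void *dst, size_t dstlen, const void *src, size_t srclen,
    size_t *copied)
{
	assert(dst != NULL && src != NULL && copied != NULL);
	if (srclen > dstlen) {
		*copied = 0;
		return (ENOMEM);	/* caller should retry with more space */
	}
	memcpy(dst, src, srclen);
	*copied = srclen;
	return (0);
}
```

A caller that sees `ENOMEM` can enlarge its buffer and retry, which matters when the fd table may change between a size query and the actual read.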
#
327060bd |
|
16-Dec-2021 |
Mark Johnston <markj@FreeBSD.org> |
fd: Initialize more export_fd_buf fields in kern_proc_cwd_out() In particular, we need to initialize efbuf->flags, since export_vnode_to_sb() loads that field. This was mostly harmless since the flag only determines whether the output kinfo_file is packed, and KERN_PROC_CWD only ever emits a single kinfo_file anyway. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
|
#
794d3e8e |
|
05-Dec-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
fcntl(2): add F_KINFO operation that returns struct kinfo_file for the given file descriptor. Among other data, it also returns kf_path, if file op was able to restore file path. Reviewed by: jhb, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33277
|
#
6e51d61a |
|
06-Dec-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add declaration for static export_file_to_kinfo() Reviewed by: jhb, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33277
|
#
6eefabd4 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: improve nstat, nfstat, nlstat Optionally return errors when truncating dev_t, ino_t, and nlink_t. In the interest of code reuse, use freebsd11_cvtstat() to perform the truncation and error handling and then convert the resulting struct freebsd11_stat to struct nstat. Add missing freebsd32 compat syscalls. These syscalls require translation because struct nstat contains four instances of struct timespec which in turn contains a time_t and a long. Reviewed by: kib
|
#
be10c0a9 |
|
03-Nov-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
fexecve(2): allow O_PATH file descriptors opened without O_EXEC This improves compatibility with Linux. Noted by: Drew DeVault <sir@cmpwn.com> Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32821
|
#
7dd419ca |
|
26-Sep-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
cache: add empty path support This avoids spurious drop offs as EMPTY is passed regardless of the actual path name. Pushing the work inside the lookup instead of just ignoring the flag allows avoiding the empty pathname check for all other lookups.
|
#
2b68eb8e |
|
01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove thread argument from VOP_STAT and fo_stat.
|
#
a0558fe9 |
|
28-Apr-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
Retire code added to support CloudABI CloudABI was removed in cf0ee8738e31aa9e6fbf4dca4dac56d89226a71a
|
#
85c855d3 |
|
29-Sep-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add pwd_hold_proc
|
#
d71e1a88 |
|
25-Sep-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fifo: support flock This evens it up with Linux. Original patch by: Greg V <greg@unrelenting.technology> Differential Revision: https://reviews.freebsd.org/D24255#565302
|
#
7326e858 |
|
28-Aug-2021 |
Mark Johnston <markj@FreeBSD.org> |
fsetown: Avoid process group lock recursion Restore the pre-1d874ba4f8ba behaviour of disassociating the current SIGIO recipient before looking up the specified process or process group. This avoids a lock recursion in the scenario where a process group is configured to receive SIGIO for an fd when it has already been so configured. Reported by: pho Tested by: pho Reviewed by: kib MFC after: 3 days
|
#
a507a40f |
|
25-Aug-2021 |
Mark Johnston <markj@FreeBSD.org> |
fsetown: Simplify error handling No functional change intended. Suggested by: kib Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31671
|
#
1d874ba4 |
|
25-Aug-2021 |
Mark Johnston <markj@FreeBSD.org> |
fsetown: Fix process lookup bugs - pget()/pfind() will acquire the PID hash bucket locks, which are sleepable sx locks, but this means that the sigio mutex cannot be held while calling these functions. Instead, use pget() to hold the process, after which we lock the sigio and proc locks, respectively. - funsetownlst() assumes that processes cannot be registered for SIGIO once they have P_WEXIT set. However, pfind() will happily return exiting processes, breaking the invariant. Add an explicit check for P_WEXIT in fsetown() to fix this. [1] Fixes: f52979098d3c ("Fix a pair of races in SIGIO registration") Reported by: syzkaller [1] Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31661
|
#
0dcef81d |
|
23-Jul-2021 |
Mark Johnston <markj@FreeBSD.org> |
Add required sysctl name length checks to various handlers Reported by: KMSAN MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
#
9bfddb3a |
|
27-May-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use PROC_WAIT_UNLOCKED when clearing p_fd/p_pd
|
#
1762f674 |
|
14-May-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
ktrace: pack all ktrace parameters into allocated structure ktr_io_params Ref-count the ktr_io_params structure instead of vnode/cred. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30257
|
#
70c05850 |
|
14-May-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
kern_descrip.c: Style Wrap too long lines. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30257
|
#
bbf7a4e8 |
|
07-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
O_PATH: allow vnode kevent filter on such files if VREAD access is checked as allowed during open Requested by: wulf Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
a5970a52 |
|
03-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Make files opened with O_PATH to not block non-forced unmount by only keeping hold count on the vnode, instead of the use count. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
8d9ed174 |
|
17-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
open(2): Implement O_PATH Reviewed by: markj Tested by: pho Discussed with: walker.aj325_gmail.com, wulf Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
42be0a7b |
|
17-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Style. Add missed spaces, wrap long lines. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
fa323503 |
|
23-Feb-2021 |
Alex Richardson <arichardson@FreeBSD.org> |
close_range: add audit support This fixes the closefrom test in sys/audit. Includes cherry-picks of the following commits from openbsm: https://github.com/openbsm/openbsm/commit/4dfc628aafe589d68848f7033f3d3488c4d979e0 https://github.com/openbsm/openbsm/commit/99ff6fe32aebc5a4b8d40d60062b8574697df557 https://github.com/openbsm/openbsm/commit/da48a0399e95448693d3fa2be48454ca564c1be8 Reviewed By: kevans Differential Revision: https://reviews.freebsd.org/D28388
|
#
d4380c0c |
|
19-Feb-2021 |
Jamie Gritton <jamie@FreeBSD.org> |
jail: Change both root and working directories in jail_attach(2) jail_attach(2) performs an internal chroot operation, leaving it up to the calling process to assure the working directory is inside the jail. Add a matching internal chdir operation to the jail's root. Also ignore kern.chroot_allow_open_directories, and always disallow the operation if there are any directory descriptors open. Reported by: mjg Approved by: markj, kib MFC after: 3 days
|
#
0482d7c9 |
|
15-Feb-2021 |
Alex Richardson <arichardson@FreeBSD.org> |
Fix fget_only_user() to return ENOTCAPABLE on a failed capsicum check After eaad8d1303da500ed691bd774742a4555a05e729 four additional capsicum-test tests started failing. It turns out this is because fget_only_user() was returning EBADF on a failed capsicum check instead of forwarding the return value of cap_check_inline() like fget_unlocked_seq(). capsicum-test failures before this: ``` [ FAILED ] 7 tests, listed below: [ FAILED ] Capability.OperationsForked [ FAILED ] Capability.NoBypassDAC [ FAILED ] Pdfork.OtherUserForked [ FAILED ] PipePdfork.WildcardWait [ FAILED ] OpenatTest.WithFlag [ FAILED ] ForkedOpenatTest_WithFlagInCapabilityMode._ [ FAILED ] Select.LotsOFileDescriptorsForked ``` After: ``` [ FAILED ] 3 tests, listed below: [ FAILED ] Capability.NoBypassDAC [ FAILED ] Pdfork.OtherUserForked [ FAILED ] PipePdfork.WildcardWait ``` Reviewed By: mjg MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28691
|
#
eaad8d13 |
|
28-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add fget_only_user This can be used by single-threaded processes which don't share a file descriptor table to access their file objects without having to reference them. For example select consumers tend to match the requirement and have several file descriptors to inspect.
|
#
5753be8e |
|
13-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add refcount argument to falloc_noinstall This lets callers avoid atomic ops by initializing the count to required value from the get go. While here add falloc_abort to backpedal from this without having to fdrop.
|
#
530b699a |
|
12-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add finstall_refed Can be used to consume an already existing reference and consequently avoid atomic ops.
|
#
4faa375c |
|
12-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: provide a dedicated closef variant for unix socket code This avoids testing for td != NULL.
|
#
71bd18d3 |
|
06-Jan-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use seqc_read_notmodify when translating fds
|
#
20ac5cda |
|
23-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make fd/fp mandatory They are both always passed anyway.
|
#
bb3a12f0 |
|
28-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: inline pwd_get_smr Tested by: pho
|
#
7a202823 |
|
23-Dec-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Expose eventfd in the native API/ABI using a new __specialfd syscall eventfd is a Linux system call that produces special file descriptors for event notification. When porting Linux software, it is currently usually emulated by epoll-shim on top of kqueues. Unfortunately, kqueues are not passable between processes. And, as noted by the author of epoll-shim, even if they were, the library state would also have to be passed somehow. This came up when debugging strange HW video decode failures in Firefox. A native implementation would avoid these problems and help with porting Linux software. Since we now already have an eventfd implementation in the kernel (for the Linuxulator), it's pretty easy to expose it natively, which is what this patch does. Submitted by: greg@unrelenting.technology Reviewed by: markj (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26668
|
#
57efe26b |
|
17-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: reimplement close_range to avoid spurious relocking
|
#
08a5615c |
|
17-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
audit: rework AUDIT_SYSCLOSE This in particular avoids spurious lookups on close.
|
#
1e71e7c4 |
|
17-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: refactor closefp in preparation for close_range rework
|
#
08241fed |
|
16-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove redundant saturation check from fget_unlocked_seq refcount_acquire_if_not_zero returns true on saturation. The case of 0 is handled by looping again, after which the originally found pointer will no longer be there. Noted by: kib
|
#
edcdcefb |
|
13-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: fix fdrop prediction when closing a fd Most of the time this is the last reference, contrary to typical fdrop use.
|
#
0ecce93d |
|
10-Dec-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make serialization in fdescfree_fds conditional on hold count p_fd nullification in fdescfree serializes against new threads transitioning the count 1 -> 2, meaning that fdescfree_fds observing the count of 1 can safely assume there is nobody else using the table. Losing the race and observing > 1 is harmless. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D27522
|
#
3309fa74 |
|
09-Dec-2020 |
Mark Johnston <markj@FreeBSD.org> |
Plug a race between fd table teardown and several loops To export information from fd tables we have several loops which do this: FILEDESC_SLOCK(fdp); for (i = 0; fdp->fd_refcount > 0 && i <= lastfile; i++) <export info for fd i>; FILEDESC_SUNLOCK(fdp); Before r367777, fdescfree() acquired the fd table exclusive lock between decrementing fdp->fd_refcount and freeing table entries. This serialized with the loop above, so the file at descriptor i would remain valid until the lock is dropped. Now there is no serialization, so the loops may race with teardown of file descriptor tables. Acquire the exclusive fdtable lock after releasing the final table reference to provide a barrier synchronizing with these loops. Reported by: pho Reviewed by: kib (previous version), mjg Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27513
|
#
4c1c90ea |
|
09-Dec-2020 |
Mark Johnston <markj@FreeBSD.org> |
Use refcount_load(9) to load fd table reference counts No functional change intended. Reviewed by: kib, mjg Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27512
|
#
c7ef3490 |
|
24-Nov-2020 |
Kyle Evans <kevans@FreeBSD.org> |
kern: never restart syscalls calling closefp(), e.g. close(2) All paths leading into closefp() will either replace or remove the fd from the filedesc table, and closefp() will call fo_close methods that can and do currently sleep without regard for the possibility of an ERESTART. This can be dangerous in multithreaded applications as another thread could have opened another file in its place that is subsequently operated on upon restart. The following are seemingly the only ones that will pass back ERESTART in-tree: - sockets (SO_LINGER) - fusefs - nfsclient Reviewed by: jilles, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27310
|
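One way to keep a close path from being restarted is to translate a restart request into an ordinary interrupted-call error before returning. The sketch below assumes that approach. `SKETCH_ERESTART` and `close_error_filter()` are hypothetical names, and the constant is a placeholder, since the kernel's restart code is not visible to userland.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical placeholder for the kernel-internal restart code. */
#define	SKETCH_ERESTART	(-1)

/*
 * Hypothetical filter applied to the result of a close operation.
 * By the time close returns, the descriptor slot has already been
 * removed or replaced, so restarting could act on an unrelated file
 * opened by another thread.  Report EINTR instead of restarting.
 */
int
close_error_filter(int error)
{
	if (error == SKETCH_ERESTART)
		return (EINTR);
	return (error);
}
```

All other error values, including success, pass through unchanged.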
#
f96078b8 |
|
22-Nov-2020 |
Kyle Evans <kevans@FreeBSD.org> |
kern: dup: do not assume oldfde is valid oldfde may be invalidated if the table has grown due to the operation that we're performing, either via fdalloc() or a direct fdgrowtable_exp(). This was technically OK before rS367927 because the old table remained valid until the filedesc became unused, but now it may be freed immediately if it's an unshared table in a single-threaded process, so it is no longer a good assumption to make. This fixes dup/dup2 invocations that grow the file table; in the initial report, it manifested as a kernel panic in devel/gmake's configure script. Reported by: Guy Yur <guyyur gmail com> Reviewed by: rew Differential Revision: https://reviews.freebsd.org/D27319
|
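The hazard in the dup entry above is the general one of keeping a pointer into an array across an operation that may reallocate it. A minimal userland sketch follows, with hypothetical `struct table`, `table_grow()`, and `table_dup()` standing in for the fd table and its growth path. The point is that the source entry is looked up only after any growth has happened.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a descriptor table. */
struct entry {
	int value;
};

struct table {
	struct entry *slots;
	size_t nslots;
};

/* Grow to newsize slots; the old array is freed immediately. */
int
table_grow(struct table *t, size_t newsize)
{
	struct entry *n;

	if (newsize <= t->nslots)
		return (0);
	n = calloc(newsize, sizeof(*n));
	if (n == NULL)
		return (-1);
	if (t->nslots > 0)
		memcpy(n, t->slots, t->nslots * sizeof(*n));
	free(t->slots);		/* pointers into the old array are now stale */
	t->slots = n;
	t->nslots = newsize;
	return (0);
}

/* Copy slot oldidx to slot newidx, growing the table if needed. */
int
table_dup(struct table *t, size_t oldidx, size_t newidx)
{
	struct entry *olde;

	assert(t != NULL);
	if (oldidx >= t->nslots || newidx == SIZE_MAX)
		return (-1);
	if (newidx >= t->nslots && table_grow(t, newidx + 1) != 0)
		return (-1);
	/* Fetch the source entry only after growth, never before. */
	olde = &t->slots[oldidx];
	t->slots[newidx] = *olde;
	return (0);
}
```

Taking `olde` before the `table_grow()` call would leave it pointing into freed memory whenever the table had to grow.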
#
3c85ca21 |
|
21-Nov-2020 |
Robert Wing <rew@FreeBSD.org> |
fd: free old file descriptor tables when not shared During the life of a process, new file descriptor tables may be allocated. When a new table is allocated, the old table is placed in a free list and held onto until all processes referencing them exit. When a new file descriptor table is allocated, the old file descriptor table can be freed when the current process has a single-thread and the file descriptor table is not being shared with any other processes. Reviewed by: kevans Approved by: kevans (mentor) Differential Revision: https://reviews.freebsd.org/D18617
|
#
85078b85 |
|
17-Nov-2020 |
Conrad Meyer <cem@FreeBSD.org> |
Split out cwd/root/jail, cmask state from filedesc table No functional change intended. Tracking these structures separately for each proc enables future work to correctly emulate clone(2) in linux(4). __FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof. Reviewed by: kib Discussed with: markj, mjg Differential Revision: https://reviews.freebsd.org/D27037
|
#
f5297909 |
|
11-Nov-2020 |
Mark Johnston <markj@FreeBSD.org> |
Fix a pair of races in SIGIO registration First, funsetownlst() looks at the first element of the list to see whether it's processing a process or a process group list. Then it acquires the global sigio lock and processes the list. However, nothing prevents the first sigio tracker from being freed by a concurrent funsetown() before the sigio lock is acquired. Fix this by acquiring the global sigio lock immediately after checking whether the list is empty. Callers of funsetownlst() ensure that new sigio trackers cannot be added concurrently. Second, fsetown() uses funsetown() to remove an existing sigio structure from a file object. However, funsetown() uses a racy check to avoid the sigio lock, so two threads may call fsetown() on the same file object, both observe that no sigio tracker is present, and enqueue two sigio trackers for the same file object. However, if the file object is destroyed, funsetown() will only remove one sigio tracker, and funsetownlst() may later trigger a use-after-free when it clears the file object reference for each entry in the list. Fix this by introducing funsetown_locked(), which avoids the racy check. Reviewed by: kib Reported by: pho Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27157
|
#
3c50616f |
|
04-Nov-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make all f_count uses go through refcount_*
|
#
d737e9ea |
|
04-Nov-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: hide _fdrop 0 count check behind INVARIANTS While here use refcount_load and make sure to report the tested value.
|
#
dd28b379 |
|
09-Oct-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: support lockless dirfd lookups
|
#
4e226610 |
|
05-Oct-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
cache: fix pwd use-after-free in setting up fallback Since the code exits smr section prior to calling pwd_hold, the used pwd can be freed and a new one allocated with the same address, making the comparison erroneously true. Note it is very unlikely anyone ran into it.
|
#
96474d2a |
|
15-Sep-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not copy vp into f_data for DTYPE_VNODE files. The pointer to vnode is already stored into f_vnode, so f_data can be reused. Fix all found users of f_data for DTYPE_VNODE. Provide finit_vnode() helper to initialize file of DTYPE_VNODE type. Reviewed by: markj (previous version) Discussed with: freqlabs (openzfs chunk) Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26346
|
#
54052eda |
|
08-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: fix fhold on an uninitialized var in fdcopy_remapped Reported by: gcc9
|
#
cd4a1797 |
|
22-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: pwd_drop after releasing filedesc lock Fixes a potential LOR against vnode lock.
|
#
e914224a |
|
25-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: put back FILEDESC_SUNLOCK to pwd_hold lost during rebase Reported by: pho
|
#
07d2145a |
|
25-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add the infrastructure for lockless lookup Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25577
|
#
d8bc2a17 |
|
15-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove fd_lastfile It keeps getting recalculated way more often than is needed. Provide a routine (fdlastfile) to get it if necessary. Consumers may be better off with a bitmap iterator instead.
|
#
7177149a |
|
15-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add obvious branch predictions to fdalloc
|
#
373278a7 |
|
11-Jul-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: stop looping in pwd_hold We don't expect to fail acquiring the reference unless running into a corner case. Just in case ensure forward progress by taking the lock. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D25616
|
#
f2706588 |
|
21-Jun-2020 |
Thomas Munro <tmunro@FreeBSD.org> |
vfs: track sequential reads and writes separately For software like PostgreSQL and SQLite that sometimes reads sequentially while also writing sequentially some distance behind with interleaved syscalls on the same fd, performance is better on UFS if we do sequential access heuristics separately for reads and writes. Patch originally by Andrew Gierth in 2008, updated and proposed by me with his permission. Reviewed by: mjg, kib, tmunro Approved by: mjg (mentor) Obtained from: Andrew Gierth <andrew@tao11.riddles.org.uk> Differential Revision: https://reviews.freebsd.org/D25024
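The effect of tracking reads and writes separately can be illustrated with a small model. This is only a sketch: the struct, the `SEQ_READ`/`SEQ_WRITE` indices, and the "increment if the offset continues the previous access, otherwise reset" rule are hypothetical simplifications, not the kernel's actual heuristic.

```c
#include <stdint.h>

/* Hypothetical per-direction indices; not kernel identifiers. */
enum { SEQ_READ = 0, SEQ_WRITE = 1 };

/* One expected-next-offset and one run length per direction. */
struct seq_state {
	int64_t nextoff[2];
	int seqcount[2];
};

/*
 * Record an access of len bytes at off in direction rw and return the
 * current run length for that direction. An access in the other
 * direction never resets this run, which is the point of the split.
 */
static int
seq_update(struct seq_state *s, int rw, int64_t off, int64_t len)
{
	if (off == s->nextoff[rw])
		s->seqcount[rw]++;
	else
		s->seqcount[rw] = 1;
	s->nextoff[rw] = off + len;
	return (s->seqcount[rw]);
}
```

With a single shared counter, the interleaved pattern described in the message (read ahead, write some distance behind) would reset the run on every call; with two counters each stream keeps growing its own run.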
|
#
21d3be91 |
|
27-Apr-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
pwd: unbreak repeated calls to set_rootvnode Prior to the change the once set pointer would never be updated. Unbreaks reboot -r. Reported by: Ross Gohlke
|
#
7d03e081 |
|
14-Apr-2020 |
Kyle Evans <kevans@FreeBSD.org> |
Mark closefrom(2) COMPAT12, reimplement in libc to wrap close_range Include a temporary compatibility shim as well for kernels predating close_range, since closefrom is used in some critical areas. Reviewed by: markj (previous version), kib Differential Revision: https://reviews.freebsd.org/D24399
|
#
605c4cda |
|
13-Apr-2020 |
Kyle Evans <kevans@FreeBSD.org> |
close_range/closefrom: fix regression from close_range introduction close_range will clamp the range between [0, fdp->fd_lastfile], but failed to take into account that fdp->fd_lastfile can become -1 if all fds are closed. =-( In this scenario, just return because there's nothing further we can do at the moment. Add a test case for this, fork() and simply closefrom(0) twice in the child; on the second invocation, fdp->fd_lastfile == -1 and will trigger a panic before this change. X-MFC-With: r359836
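The clamp and the early return described in this message can be sketched in user space. The function below is a hypothetical model, not the kernel code: `clamp_close_range` and its parameters are invented for illustration, with `lastfile` standing in for `fdp->fd_lastfile`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the range clamp; not the kernel implementation.
 * lastfile is the highest open descriptor, or -1 when none are open.
 * Returns how many descriptors would be visited and, when that count is
 * non-zero, stores the clamped bounds in *lo and *hi.
 */
static unsigned int
clamp_close_range(unsigned int lowfd, unsigned int highfd, int lastfile,
    unsigned int *lo, unsigned int *hi)
{
	assert(lo != NULL && hi != NULL);

	/* Nothing is open: return early instead of using -1 as a bound. */
	if (lastfile < 0)
		return (0);
	if (highfd > (unsigned int)lastfile)
		highfd = (unsigned int)lastfile;
	if (lowfd > highfd)
		return (0);
	*lo = lowfd;
	*hi = highfd;
	return (highfd - lowfd + 1);
}
```

A second `closefrom(0)` in the same process corresponds to calling this with `lastfile == -1`, which now yields an empty range.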
|
#
472ced39 |
|
12-Apr-2020 |
Kyle Evans <kevans@FreeBSD.org> |
Implement a close_range(2) syscall close_range(min, max, flags) allows for a range of descriptors to be closed. The Python folk have indicated that they would much prefer this interface to closefrom(2), as the case may be that they/someone have special fds dup'd to higher in the range and they can't necessarily closefrom(min) because they don't want to hit the upper range, but relocating them to lower isn't necessarily feasible. sys_closefrom has been rewritten to use kern_close_range() using ~0U to indicate closing to the end of the range. This was chosen rather than requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the call to kern_close_range for simplicity. The flags argument of close_range(2) is currently unused, so any flags set is currently EINVAL. It was added to the interface in Linux so that future flags could be added for, e.g., "halt on first error" and things of this nature. This patch is based on a syscall of the same design that is expected to be merged into Linux. Reviewed by: kib, markj, vangyzen (all slightly earlier revisions) Differential Revision: https://reviews.freebsd.org/D21627
|
#
429537ca |
|
19-Mar-2020 |
Mark Johnston <markj@FreeBSD.org> |
kern_dup(): Call filecaps_free_prep() in a write section. filecaps_free_prep() bzeros the capabilities structure and we need to be careful to synchronize with unlocked readers, which expect a consistent rights structure. Reviewed by: kib, mjg Reported by: syzbot+5f30b507f91ddedded21@syzkaller.appspotmail.com MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24120
|
#
d2222aa0e |
|
07-Mar-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use smr for managing struct pwd This has a side effect of eliminating filedesc slock/sunlock during path lookup, which in turn removes contention vs concurrent modifications to the fd table. Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D23889
|
#
8d03b99b |
|
01-Mar-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: move vnodes out of filedesc into a dedicated structure The new structure is copy-on-write. With the assumption that path lookups are significantly more frequent than chdirs and chrooting this is a win. This provides stable root and jail root vnodes without the need to reference them on lookup, which in turn means less work on globally shared structures. Note this also happens to fix a bug where jail vnode was never referenced, meaning subsequent access on lookup could run into use-after-free. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23884
|
#
8243063f |
|
01-Mar-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make fgetvp_rights work without the filedesc lock Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23883
|
#
32a86c44 |
|
14-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use new capsicum helpers
|
#
8f86349f |
|
14-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove no longer needed atomic_load_ptr casts
|
#
6ed30ea4 |
|
14-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: annotate finstall with prediction branches
|
#
0f5f49ef |
|
13-Feb-2020 |
Kyle Evans <kevans@FreeBSD.org> |
u_char -> vm_prot_t in a couple of places, NFC The latter is a typedef of the former; the typedef exists and these bits are representing vmprot values, so use the correct type. Submitted by: sigsys@gmail.com MFC after: 3 days
|
#
1a9fe452 |
|
04-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: always nullify *fdp in fget* routines Some consumers depend on the pointer being NULL if an error is returned. The guarantee got broken in r357469. Reported by: https://syzkaller.appspot.com/bug?extid=0c9b05e2b727aae21eef Noted by: markj
|
#
8151b6e9 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: partially unengrish the previous commit
|
#
e10f063b |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: streamline fget_unlocked clang has the unfortunate property of paying little attention to prediction hints when faced with a loop spanning the majority of the routine. In particular fget_unlocked has an unlikely corner case where it starts almost from scratch. Faced with this clang generates a maze of taken jumps, whereas gcc produces jump-free code (in the expected case). Work around the problem by providing a variant which only tries once and resorts to calling the original code if anything goes wrong. While here note that the 'seq' parameter is almost never passed, thus the rare users are redirected to call it directly.
|
#
52604ed7 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove the seq argument from fget_unlocked It is almost always NULL.
|
#
7f1566f8 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove the seq argument from fget routines It is almost always NULL.
|
#
0a1427c5 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
ktrace: provide ktrstat_error This eliminates a branch from its consumers trading it for an extra call if ktrace is enabled for curthread. Given that this is almost never true, the tradeoff is worth it.
|
#
bcd1cf4f |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
capsicum: faster cap_rights_contains Instead of doing a 2 iteration loop (determined at runtime), take advantage of the fact that the size is already known. While here provide cap_check_inline so that fget_unlocked does not have to do a function call. Verified with the capsicum suite /usr/tests.
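The difference between the two approaches can be sketched as follows. The `struct rights` type and both function names are hypothetical; the real `cap_rights_t` layout and helpers differ, and this only illustrates replacing a runtime-bounded loop with a fixed two-word check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word rights set; not the real cap_rights_t. */
#define RIGHTS_WORDS 2
struct rights {
	uint64_t word[RIGHTS_WORDS];
};

/* Loop variant: n is only known at run time, so the loop must stay. */
static bool
rights_contains_loop(const struct rights *big, const struct rights *little,
    int n)
{
	for (int i = 0; i < n; i++) {
		if ((big->word[i] & little->word[i]) != little->word[i])
			return (false);
	}
	return (true);
}

/* Fixed variant: the size is known, so both words are tested directly. */
static bool
rights_contains_fixed(const struct rights *big, const struct rights *little)
{
	return ((big->word[0] & little->word[0]) == little->word[0] &&
	    (big->word[1] & little->word[1]) == little->word[1]);
}
```

Both variants return the same answer; the fixed one simply gives the compiler nothing to iterate over.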
|
#
fee20454 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: fix f_count acquire in fget_unlocked The code was using a hand-rolled fcmpset loop, while in other places the same count is manipulated with the refcount API. This transferred from a stylistic issue into a bug after the API got extended to support flags. As a result the hand-rolled loop could bump the count high enough to set the bit flag. Another bump + refcount_release would then free the file prematurely. The bug is only present in -CURRENT.
|
#
2568d5bb |
|
02-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: sprinkle some predicts around fget clang inlines fget -> _fget into kern_fstat and eliminates several checks, but prior to this change it would assume fget_unlocked was likely to fail and consequently avoidable jumps got generated.
|
#
da4f45ea |
|
02-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use atomic_load_ptr instead of hand-rolled cast through volatile No change in assembly.
|
#
d3cc5354 |
|
17-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: provide F_ISUNIONSTACK as a kludge for libc Prior to introduction of this op libc's readdir would call fstatfs(2), in effect unnecessarily copying kilobytes of data just to check fs name and a mount flag. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D23162
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
55eb92db |
|
11-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: static-ize and devolatile openfiles Almost all access is using atomics. The only read is sysctl which should use a whole-int-at-a-time friendly read internally.
|
#
4a7b33ec |
|
02-Oct-2019 |
Mark Johnston <markj@FreeBSD.org> |
Disallow fcntl(F_READAHEAD) when the vnode is not a regular file. The mountpoint may not have defined an iosize parameter, so an attempt to configure readahead on a device file can lead to a divide-by-zero crash. The sequential heuristic is not applied to I/O to or from device files, and posix_fadvise(2) returns an error when v_type != VREG, so perform the same check here. Reported by: syzbot+e4b682208761aa5bc53a@syzkaller.appspotmail.com Reviewed by: kib MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21864
|
#
af755d3e |
|
25-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
[1/3] Add mostly Linux-compatible file sealing support File sealing applies protections against certain actions (currently: write, growth, shrink) at the inode level. New fileops are added to accommodate seals - EINVAL is returned by fcntl(2) if they are not implemented. Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D21391
|
#
f1cf2b9d |
|
21-Jul-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Check and avoid overflow when incrementing fp->f_count in fget_unlocked() and fhold(). On sufficiently large machine, f_count can be legitimately very large, e.g. malicious code can dup same fd up to the per-process filedescriptors limit, and then fork as much as it can. On some smaller machine, I see kern.maxfilesperproc: 939132 kern.maxprocperuid: 34203 which already overflows u_int. More, the malicious code can create transient references by sending fds over unix sockets. I realized that this check is missed after reading https://secfault-security.com/blog/FreeBSD-SA-1902.fd.html Reviewed by: markj (previous version), mjg Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20947
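The arithmetic behind the overflow claim, and the shape of a checked increment, can be sketched like this. Both functions are hypothetical illustrations written with C11 atomics; they are not the kernel's refcount API.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Worst case from the message: every process of one user holds the
 * per-process maximum of references to the same file. The product is
 * computed in 64 bits so it can be compared against UINT_MAX.
 */
static bool
count_overflows_u_int(uint64_t fds_per_proc, uint64_t procs)
{
	return (fds_per_proc * procs > UINT_MAX);
}

/*
 * Hypothetical checked acquire: refuse to take a reference when the
 * count is zero (object already dying) or when one more increment
 * would wrap the counter.
 */
static bool
ref_acquire_checked(atomic_uint *count)
{
	unsigned int old = atomic_load(count);

	for (;;) {
		if (old == 0 || old == UINT_MAX)
			return (false);
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return (true);
	}
}
```

With the quoted limits, 939132 * 34203 is about 3.2e10, well above the roughly 4.29e9 that a 32-bit `u_int` can hold.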
|
#
7c3703a6 |
|
29-Jun-2019 |
Mark Johnston <markj@FreeBSD.org> |
Use a consistent snapshot of the fd's rights in fget_mmap(). fget_mmap() translates rights on the descriptor to a VM protection mask. It was doing so without holding any locks on the descriptor table, so a writer could simultaneously be modifying those rights. Such a situation would be detected using a sequence counter, but not before an inconsistency could trigger assertion failures in the capability code. Fix the problem by copying the fd's rights to a structure on the stack, and perform the translation only once we know that that snapshot is consistent. Reported by: syzbot+ae359438769fda1840f8@syzkaller.appspotmail.com Reviewed by: brooks, mjg MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20800
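The copy-then-validate pattern described here can be sketched with a sequence counter. Everything below is a hypothetical model using C11 atomics (`struct entry`, `entry_snapshot`, and the bit-to-protection mapping are invented); it is not the kernel's seqc code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor entry guarded by a sequence counter. */
struct entry {
	atomic_uint seq;		/* odd while a writer is mid-update */
	_Atomic uint64_t rights[2];
};

static void
entry_init(struct entry *e)
{
	atomic_init(&e->seq, 0);
	atomic_init(&e->rights[0], 0);
	atomic_init(&e->rights[1], 0);
}

/* Single writer: bump to odd, update, bump back to even. */
static void
entry_write(struct entry *e, uint64_t r0, uint64_t r1)
{
	atomic_fetch_add(&e->seq, 1);
	atomic_store(&e->rights[0], r0);
	atomic_store(&e->rights[1], r1);
	atomic_fetch_add(&e->seq, 1);
}

/* Copy the rights to the caller's buffer; true only if consistent. */
static bool
entry_snapshot(struct entry *e, uint64_t out[2])
{
	unsigned int before = atomic_load(&e->seq);

	if (before & 1)
		return (false);
	out[0] = atomic_load(&e->rights[0]);
	out[1] = atomic_load(&e->rights[1]);
	return (atomic_load(&e->seq) == before);
}

/* Translate only a validated snapshot (bit 0 = read, bit 1 = write). */
static int
prot_from_snapshot(const uint64_t snap[2])
{
	return ((int)(snap[0] & 3));
}
```

The translation step never looks at the shared entry, so it cannot observe a half-written rights structure; a caller retries `entry_snapshot` until it returns true.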
|
#
38b06f8a |
|
20-Jun-2019 |
Alan Somers <asomers@FreeBSD.org> |
fcntl: fix overflow when setting F_READAHEAD VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field. However, fcntl allows you to set the seqcount in bytes to any nonnegative 31-bit value. The result can be a 16-bit overflow, which will be sign-extended in functions like ffs_read. Fix this by sanitizing the argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather than INT16_MAX. Also, fifos have overloaded the f_seqcount field for a completely different purpose ever since r238936. Formalize that by using a union type. Reviewed by: cem MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20710
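The sanitizing step can be sketched as a bytes-to-blocks conversion followed by a clamp. This is an illustration only: `readahead_blocks` is a hypothetical helper, and `SKETCH_IO_SEQMAX` is an assumed stand-in for the kernel's `IO_SEQMAX` (check `sys/vnode.h` for the real value).

```c
#include <assert.h>
#include <limits.h>

/* Assumed cap for illustration; the real constant is IO_SEQMAX. */
#define SKETCH_IO_SEQMAX 0x7F

/*
 * Convert a readahead request in bytes to a block count that is safe
 * to store in a 16-bit field.
 */
static int
readahead_blocks(int bytes, int bsize)
{
	int blocks;

	assert(bytes >= 0 && bsize > 0);

	/* Keep the round-up below from overflowing int. */
	if (bytes > INT_MAX - bsize + 1)
		bytes = INT_MAX - bsize + 1;
	blocks = (bytes + bsize - 1) / bsize;
	return (blocks > SKETCH_IO_SEQMAX ? SKETCH_IO_SEQMAX : blocks);
}
```

Clamping to the policy limit rather than to `INT16_MAX` keeps the stored value inside the range the sequential heuristic itself would ever produce.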
|
#
bc2d137a |
|
22-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Make pack_kinfo() available for external callers. Reviewed by: jilles, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20258
|
#
fd76e780 |
|
25-Mar-2019 |
Mark Johnston <markj@FreeBSD.org> |
Reject F_SETLK_REMOTE commands when sysid == 0. A sysid of 0 denotes the local system, and some handlers for remote locking commands do not attempt to deal with local locks. Note that F_SETLK_REMOTE is only available to privileged users as it is intended to be used as a testing interface. Reviewed by: kib Reported by: syzbot+9c457a6ae014a3281eb8@syzkaller.appspotmail.com MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19702
|
#
55fda581 |
|
27-Feb-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
Rename seq to seqc to avoid namespace clashes with Linux Linux generates the content of procfs files using a mechanism prefixed with seq_*. This in particular came up with recent gcov import. Sponsored by: The FreeBSD Foundation
|
#
ebe0b35a |
|
23-Feb-2019 |
Matt Macy <mmacy@FreeBSD.org> |
Change seq_read to seq_load to avoid namespace conflicts with lkpi MFC after: 1 week Sponsored by: iX Systems
|
#
093295ae |
|
20-Feb-2019 |
Mark Johnston <markj@FreeBSD.org> |
Remove an obsolete comment. MFC after: 3 days
|
#
24d64be4 |
|
13-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: mostly depessimize NDINIT_ALL 1) filecaps_init was unnecessarily a function call 2) an assignment at the end was preventing tail calling of cap_rights_init Sponsored by: The FreeBSD Foundation
|
#
6b2d6113 |
|
10-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: dedup code in sys_getdtablesize Sponsored by: The FreeBSD Foundation
|
#
86db4d40 |
|
11-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: tidy up closing a fd - avoid a call to knote_close in the common case - annotate mqueue as unlikely Sponsored by: The FreeBSD Foundation
|
#
663de816 |
|
11-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: stop looking for exact freefile after allocation If a lower fd is closed later, the lookup goes to waste. Allocation always performs the lookup anyway. Sponsored by: The FreeBSD Foundation
|
#
08d005e6 |
|
07-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use racct_set_unlocked Sponsored by: The FreeBSD Foundation
|
#
82f4b826 |
|
07-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: try to do less work with the lock in dup Sponsored by: The FreeBSD Foundation
|
#
d47f3fdb |
|
29-Nov-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: unify fd range check across the routines While here annotate out of range as unlikely. Sponsored by: The FreeBSD Foundation
|
#
98fca94d |
|
12-Oct-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
capsicum: provide cap_rights_fde_inline Reading caps is in the hot path (on each successful fd lookup), but completely unnecessarily requires a function call. Approved by: re (gjb) Sponsored by: The FreeBSD Foundation
|
#
51e13c93 |
|
20-Sep-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: prevent inlining of _fdrop throughout kern_descrip.c fdrop is used in several places in the file and almost never has to call _fdrop. Thus inlining it is a pure waste of space. Approved by: re (kib)
|
#
bcbc8d35 |
|
12-Jul-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: stop passing M_ZERO to uma_zalloc The optimisation seen with malloc cannot be used here as zone sizes are not known at compilation. Thus bzero by hand to get the optimisation instead.
|
#
3a20f06a |
|
10-Jul-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Use uintptr_t alone when assigning to kvaddr_t variables. Suggested by: jhb
|
#
7524b4c1 |
|
06-Jul-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Correct breakage on 32-bit platforms from r335979.
|
#
f38b68ae |
|
05-Jul-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Make struct xinpcb and friends word-size independent. Replace size_t members with ksize_t (uint64_t) and pointer members (never used as pointers in userspace, but instead as unique identifiers) with kvaddr_t (uint64_t). This makes the structs identical between 32-bit and 64-bit ABIs. On 64-bit systems, the ABI is maintained. On 32-bit systems, this is an ABI breaking change. The ABI of most of these structs was previously broken in r315662. This also imposes a small API change on userspace consumers who must handle kernel pointers becoming virtual addresses. PR: 228301 (exp-run by antoine) Reviewed by: jtl, kib, rwatson (various versions) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15386
|
#
b8d908b7 |
|
01-Jun-2018 |
Ed Maste <emaste@FreeBSD.org> |
ANSIfy sys/kern
|
#
acbde298 |
|
18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
capsicum: propagate const correctness
|
#
cbd92ce6 |
|
09-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Eliminate the overhead of gratuitous repeated reinitialization of cap_rights - Add macros to allow preinitialization of cap_rights_t. - Convert most commonly used code paths to use preinitialized cap_rights_t. A 3.6% speedup in fstat was measured with this change. Reported by: mjg Reviewed by: oshogbo Approved by: sbruno MFC after: 1 month
|
#
748ff486 |
|
04-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
`dup1_processes -t 96 -s 5` on a dual 8160 (ministat output; x = dup_before, + = dup_after; distribution plot omitted)
N Min Max Median Avg Stddev
x 5 1.514954e+08 1.5230351e+08 1.5206157e+08 1.5199371e+08 341205.71
+ 5 1.5494336e+08 1.5519569e+08 1.5511982e+08 1.5508323e+08 96232.829
Difference at 95.0% confidence: 3.08952e+06 +/- 365604, 2.03266% +/- 0.245071% (Student's t, pooled s = 250681) Reported by: mjg@ MFC after: 1 week
|
#
7d853f62 |
|
22-Apr-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
lockf: slightly depessimize 1. check if P_ADVLOCK is already set and if so, don't lock to set it (stolen from DragonFly) 2. when trying for fast path unlock, check that we are doing unlock first instead of taking the interlock for no reason (e.g. if we want to *lock*). While here, make it more likely that falling fast path will not take the interlock either by checking for state Note the code is severely pessimized both single- and multithreaded.
|
#
8ce99bb4 |
|
17-Apr-2018 |
John Baldwin <jhb@FreeBSD.org> |
Properly do a deep copy of the ioctls capability array for fget_cap(). fget_cap() tries to do a cheaper snapshot of a file descriptor without holding the file descriptor lock. This snapshot does not do a deep copy of the ioctls capability array, but instead uses a different return value to inform the caller to retry the copy with the lock held. However, filecaps_copy() was returning 1 to indicate that a retry was required, and fget_cap() was checking for 0 (actually '!filecaps_copy()'). As a result, fget_cap() did not do a deep copy of the ioctls array and just reused the original pointer. This caused multiple file descriptor entries to think they owned the same pointer and eventually resulted in duplicate frees. The only code path that I'm aware of that triggers this is to create a listen socket that has a restricted list of ioctls and then call accept() which calls fget_cap() with a valid filecaps structure from getsock_cap(). To fix, change the return value of filecaps_copy() to return true if it succeeds in copying the caps and false if it fails because the lock is required. I find this more intuitive than fixing the caller in this case. While here, change the return type from 'int' to 'bool'. Finally, make filecaps_copy() more robust in the failure case by not copying any of the source filecaps structure over. This avoids the possibility of leaking a pointer into a structure if a similar future caller doesn't properly handle the return value from filecaps_copy() at the expense of one more branch. I also added a test case that panics before this change and now passes. Reviewed by: kib Discussed with: mjg (not a fan of the extra branch) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D15047
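The return-value convention adopted by this fix can be sketched in user space. The `struct caps` type and the `caps_copy`/`caps_get` functions are hypothetical stand-ins for the kernel structures; the `locked` flag merely models whether the caller holds the descriptor-table lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical capability record with an optional ioctl allow-list. */
struct caps {
	unsigned long *ioctls;
	size_t nioctls;
};

/*
 * Returns true when dst now holds an independent copy of src. Returns
 * false, leaving dst untouched, when a deep copy is needed but the
 * caller does not hold the lock.
 */
static bool
caps_copy(const struct caps *src, struct caps *dst, bool locked)
{
	unsigned long *copy;
	size_t size;

	if (src->ioctls == NULL) {
		dst->ioctls = NULL;
		dst->nioctls = 0;
		return (true);
	}
	if (!locked)
		return (false);
	assert(src->nioctls > 0);
	size = src->nioctls * sizeof(*src->ioctls);
	copy = malloc(size);
	if (copy == NULL)
		abort();
	memcpy(copy, src->ioctls, size);
	dst->ioctls = copy;
	dst->nioctls = src->nioctls;
	return (true);
}

/* Caller: try the cheap path first, retry "with the lock" on false. */
static void
caps_get(const struct caps *src, struct caps *dst)
{
	if (!caps_copy(src, dst, false)) {
		/* A real caller would take the descriptor lock here. */
		if (!caps_copy(src, dst, true))
			abort();
	}
}
```

Because the failure path writes nothing to `dst`, a caller that mishandles the return value ends up with its own (empty) state rather than a pointer shared with `src`.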
|
#
6469bdcd |
|
06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941
|
#
179da98f |
|
27-Mar-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: tighten seq protected areas to not contain malloc/free
|
#
b1288166 |
|
17-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Use long for the last argument to VOP_PATHCONF rather than a register_t. pathconf(2) and fpathconf(2) both return a long. The kern_[f]pathconf() functions now accept a pointer to a long value rather than modifying td_retval directly. Instead, the system calls explicitly store the returned long value in td_retval[0]. Requested by: bde Reviewed by: kib Sponsored by: Chelsio Communications
|
#
dd688800 |
|
19-Dec-2017 |
John Baldwin <jhb@FreeBSD.org> |
Add a custom VOP_PATHCONF method for fdescfs. The method handles NAME_MAX and LINK_MAX explicitly. For all other pathconf variables, the method passes the request down to the underlying file descriptor. This requires splitting a kern_fpathconf() syscallsubr routine out of sys_fpathconf(). Also, to avoid lock order reversals with vnode locks, the fdescfs vnode is unlocked around the call to kern_fpathconf(), but with the usecount of the vnode bumped. MFC after: 1 month Sponsored by: Chelsio Communications
|
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
76f2c272 |
|
14-Jun-2017 |
Ryan Libby <rlibby@FreeBSD.org> |
ddb show files: fix up file types and whitespace This makes ddb show files more descriptive and also adjusts the whitespace to align the columns for non-32-bit architectures. Reviewed by: cem (previous version), jhb Approved by: markj (mentor) Differential Revision: https://reviews.freebsd.org/D11061
|
#
3df7ebc4 |
|
05-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Add sysctl vfs.ino64_trunc_error controlling action on truncating inode number or link count for the ABI compat binaries. Right now, and by default after the change, too large 64bit values are silently truncated to 32 bits. Enabling the knob causes the system to return EOVERFLOW for stat(2) family of compat syscalls when some values cannot be completely represented by the old structures. For getdirentries(2), knob skips the dirents which would cause non-trivial truncation of d_ino. EOVERFLOW error is specified by the X/Open 1996 LFS document ('Adding Support for Arbitrary File Sizes to the Single UNIX Specification'). Based on the discussion with: bde Sponsored by: The FreeBSD Foundation
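The two behaviours selected by the knob can be sketched as a narrowing helper. The function name and the boolean parameter are hypothetical; only the "silently truncate" versus "return EOVERFLOW" choice is taken from the message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Narrow a 64-bit inode number for an old 32-bit structure.
 * With trunc_error false the value is silently truncated; with it true
 * a value that does not fit yields EOVERFLOW and *out is left alone.
 */
static int
ino64_to_32(uint64_t ino, bool trunc_error, uint32_t *out)
{
	uint32_t narrow = (uint32_t)ino;

	if (narrow != ino && trunc_error)
		return (EOVERFLOW);
	*out = narrow;
	return (0);
}
```

The same comparison (`narrow != ino`) is what distinguishes a harmless narrowing from a lossy one, so values that already fit are unaffected by the knob.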
|
#
69921123 |
|
23-May-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Commit the 64-bit inode project. Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify struct dirent layout to add d_off, increase the size of d_fileno to 64-bits, increase the size of d_namlen to 16-bits, and change the required alignment. Increase struct statfs f_mntfromname[] and f_mntonname[] array length MNAMELEN to 1024. ABI breakage is mitigated by providing compatibility using versioned symbols, ingenious use of the existing padding in structures, and by employing other tricks. Unfortunately, not everything can be fixed, especially outside the base system. For instance, third-party APIs which pass struct stat around are broken in backward and forward incompatible ways. Kinfo sysctl MIBs ABI is changed in backward-compatible way, but there is no general mechanism to handle other sysctl MIBS which return structures where the layout has changed. It was considered that the breakage is either in the management interfaces, where we usually allow ABI slip, or is not important. Struct xvnode changed layout, no compat shims are provided. For struct xtty, dev_t tty device member was reduced to uint32_t. It was decided that keeping ABI compat in this case is more useful than reporting 64-bit dev_t, for the sake of pstat. Update note: strictly follow the instructions in UPDATING. Build and install the new kernel with COMPAT_FREEBSD11 option enabled, then reboot, and only then install new world. Credits: The 64-bit inode project, also known as ino64, started life many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick (mckusick) then picked up and updated the patch, and acted as a flag-waver. Feedback, suggestions, and discussions were carried by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles), and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial ports investigation followed by an exp-run by Antoine Brodin (antoine). Essential and all-embracing testing was done by Peter Holm (pho). 
The heavy lifting of coordinating all these efforts and bringing the project to completion were done by Konstantin Belousov (kib). Sponsored by: The FreeBSD Foundation (emaste, kib) Differential revision: https://reviews.freebsd.org/D10439
|
#
69cfbe88 |
|
06-Apr-2017 |
Conrad Meyer <cem@FreeBSD.org> |
kern_descrip: Move kinfo_ofile size assert under COMPAT_FREEBSD7 The size and structure are not used outside of FreeBSD 7 compatibility ABIs. Sponsored by: Dell EMC Isilon
|
#
3a2f2825 |
|
04-Feb-2017 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: switch fget_unlocked to atomic_fcmpset
|
#
3071469d |
|
29-Jan-2017 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: sprinkle __read_mostly and __exclusive_cache_line
|
#
b4b4b530 |
|
28-Jan-2017 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Revert crap accidentally committed
|
#
814aaaa7 |
|
28-Jan-2017 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Revert r312923 a better approach will be taken later
|
#
4fce19da |
|
13-Jan-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Remove deprecated fgetsock() and fputsock().
|
#
d4db49c4 |
|
01-Jan-2017 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: access openfiles once in falloc_noinstall This is similar to what's done with nprocs. Note this is only a band aid.
|
#
0b3b55a0 |
|
29-Dec-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove cpu_spinwait after seq_consistent. It does not add any benefit as the read routine will do it as necessary.
|
#
5afb134c |
|
12-Dec-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add vrefact, to be used when the vnode has to be already active This allows blind increment of relevant counters which under contention is cheaper than inc-not-zero loops at least on amd64. Use it in some of the places which are guaranteed to see already active vnodes. Reviewed by: kib (previous version)
|
#
1279fdaf |
|
21-Nov-2016 |
Robert Watson <rwatson@FreeBSD.org> |
Audit 'fd' and 'cmd' arguments to fcntl(2), and when generating BSM, always audit the file-descriptor number and vnode information for all fcntl(2) commands, not just locking-related ones. This was likely an oversight in the original adaptation of this code from XNU. MFC after: 3 days Sponsored by: DARPA, AFRL
|
#
1c8260b6 |
|
24-Sep-2016 |
Julian Elischer <julian@FreeBSD.org> |
Give the user a clue as to which process hit maxfiles. MFC after: 1 week Sponsored by: Panzura
|
#
ad5e83dd |
|
23-Sep-2016 |
Mariusz Zaborski <oshogbo@FreeBSD.org> |
fd: fix up fget_cap If the kernel is not compiled with the CAPABILITIES kernel option, fget_unlocked doesn't return the sequence number, so fd_modify will always report modification; in that case we get an infinite loop. Reported by: br Reviewed by: mjg Tested by: br, def
|
#
deffc4a0 |
|
23-Sep-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: fix up fgetvp_rights after r306184 fget_cap_locked returns a referenced file, but the fgetvp_rights does not need it. Instead, due to the filedesc lock being held, it can ref the vnode after the file was looked up. Fix up fget_cap_locked to be consistent with other _locked helpers and not ref the file. This plugs a leak introduced in r306184. Pointy hat to: mjg, oshogbo
|
#
6490bc65 |
|
22-Sep-2016 |
Mariusz Zaborski <oshogbo@FreeBSD.org> |
fd: simplify fgetvp_rights by using fget_cap_locked Reviewed by: mjg
|
#
69a28758 |
|
15-Sep-2016 |
Ed Maste <emaste@FreeBSD.org> |
Renumber license clauses in sys/kern to avoid skipping #3
|
#
6e70b4f0 |
|
12-Sep-2016 |
Mariusz Zaborski <oshogbo@FreeBSD.org> |
fd: add fget_cap and fget_cap_locked primitives They can be used to obtain capabilities along with a referenced fp. Reviewed by: mjg@
|
#
dd38731e |
|
31-Aug-2016 |
Ed Maste <emaste@FreeBSD.org> |
allow kern.proc.nfds sysctl in capability mode Reviewed by: allanjude MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D7733
|
#
4cbafea0 |
|
30-Aug-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: add fdeget_locked and use in kern_descrip
|
#
382172be |
|
10-Aug-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
sigio: do a lockless check in funsetownlist There is no need to grab the lock first to see if sigio is used, and it typically is not.
|
#
51d1f690 |
|
10-Jul-2016 |
Robert Watson <rwatson@FreeBSD.org> |
Audit file-descriptor arguments to I/O system calls such as read(2), write(2), dup(2), and mmap(2). This auditing is not required by the Common Criteria (and hence was not being performed), but is valuable in both contemporary live analysis and forensic use cases. MFC after: 3 days Sponsored by: DARPA, AFRL
|
#
2dbdf49c |
|
27-May-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: provide a common exit point for unlock in kern_dup While here assert dropped filedesc lock on return from closefp.
|
#
0cfe1a1f |
|
07-May-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: assert dropped filedesc lock in fdcloseexec
|
#
e3043798 |
|
29-Apr-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/kern: spelling fixes in comments. No functional change.
|
#
9c64cfe5 |
|
29-Mar-2016 |
Gleb Smirnoff <glebius@FreeBSD.org> |
The sendfile(2) allows to send extra data from userspace before the file data (headers). Historically the size of the headers was not checked against the socket buffer space. Application could easily overcommit the socket buffer space. With the new sendfile (r293439) the problem remained, but a KASSERT was inserted that checked that amount of data written to the socket matches its space. In case when size of headers is bigger than socket space, KASSERT fires. Without INVARIANTS the new sendfile won't panic, but would report incorrect amount of bytes sent. o With this change, the headers copyin is moved down into the cycle, after the sbspace() check. The uio size is trimmed by socket space there, which fixes the overcommit problem and its consequences. o The compatibility handling for FreeBSD 4 sendfile headers API is pushed up the stack to syscall wrappers. This required a copy and paste of the code, but in turn this allowed to remove extra stack carried parameter from fo_sendfile_t, and embrace entire compat code into #ifdef. If in future we got more fo_sendfile_t function, the copy and paste level would even reduce. Reviewed by: emax, gallatin, Maxim Dounin <mdounin mdounin.ru> Tested by: Vitalij Satanivskij <satan ukr.net> Sponsored by: Netflix
|
#
399e8c17 |
|
09-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Simplify AIO initialization now that it is standard. - Mark AIO system calls as STD and remove the helpers to dynamically register them. - Use COMPAT6 for the old system calls with the older sigevent instead of an 'o' prefix. - Simplify the POSIX configuration to note that AIO is always available. - Handle AIO in the default VOP_PATHCONF instead of special casing it in the pathconf() system call. fpathconf() is still hackish. - Remove freebsd32_aio_cancel() as it just called the native one directly. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5589
|
#
b577e693 |
|
06-Nov-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: implement kern.proc.nfds sysctl Intended purpose is to provide an equivalent of OpenBSD's getdtablecount syscall for the compat library.
|
#
9af8c8b7 |
|
07-Sep-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make rights a mandatory argument to fgetvp_rights The only caller already always passes rights.
|
#
d7832811 |
|
07-Sep-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make the common case in filecaps_copy work lockless The filedesc lock is only needed if ioctls caps are present, which is a rare situation. This is a step towards reducing the scope of the filedesc lock.
|
#
14bdbaf2 |
|
03-Sep-2015 |
Conrad Meyer <cem@FreeBSD.org> |
Detect badly behaved coredump note helpers Coredump notes depend on being able to invoke dump routines twice; once in a dry-run mode to get the size of the note, and another to actually emit the note to the corefile. When a note helper emits a different length section the second time around than the length it requested the first time, the kernel produces a corrupt coredump. NT_PROCSTAT_FILES output length, when packing kinfo structs, is tied to the length of filenames corresponding to vnodes in the process' fd table via vn_fullpath. As vnodes may move around during dump, this is racy. So: - Detect badly behaved notes in putnote() and pad underfilled notes. - Add a fail point, debug.fail_point.fill_kinfo_vnode__random_path to exercise the NT_PROCSTAT_FILES corruption. It simply picks random lengths to expand or truncate paths to in fo_fill_kinfo_vnode(). - Add a sysctl, kern.coredump_pack_fileinfo, to allow users to disable kinfo packing for PROCSTAT_FILES notes. This should avoid both FILES note corruption and truncation, even if filenames change, at the cost of about 1 kiB in padding bloat per open fd. Document the new sysctl in core.5. - Fix note_procstat_files to self-limit in the 2nd pass. Since sometimes this will result in a short write, pad up to our advertised size. This addresses note corruption, at the risk of sometimes truncating the last several fd info entries. - Fix NT_PROCSTAT_FILES consumers libutil and libprocstat to grok the zero padding. With suggestions from: bjk, jhb, kib, wblock Approved by: markj (mentor) Relnotes: yes Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3548
|
#
7e8f566c |
|
02-Sep-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove UMA_ZONE_ZINIT argument from Files zone Originally it was added in order to prevent trashing of objects with INVARIANTS enabled. The same effect is now provided with mere UMA_ZONE_NOFREE. This reverts r286921. Discussed with: kib
|
#
fe5ec54b |
|
19-Aug-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
fget_unlocked() depends on the freed struct file f_count field being zero. The file_zone is no-free, but r284861 added trashing of the freed memory. The most visible manifestation of the issue was 'memory modified after free' panics for the file zone, triggered from falloc_noinstall(). Add UMA_ZONE_ZINIT flag to turn off trashing. Mjg noted that it makes sense to not trash freed memory for any no-free zone, which will be done later. Reported and tested by: pho Discussed with: mjg Sponsored by: The FreeBSD Foundation
|
#
e555b430 |
|
29-Jul-2015 |
Ed Schouten <ed@FreeBSD.org> |
Introduce falloc_caps() to create descriptors with capabilities in place. falloc_noinstall() followed by finstall() allows you to create and install file descriptors with custom capabilities. Add falloc_caps() that can do both of these actions in one go. This will be used by CloudABI to create pipes with custom capabilities. Reviewed by: mjg
|
#
2919a0c5 |
|
16-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: partially deduplicate fdescfree and fdescfree_remapped This also moves vrele of cdir/rdir/jdir vnodes earlier, which should not matter.
|
#
457f7e23 |
|
16-Jul-2015 |
Ed Schouten <ed@FreeBSD.org> |
Implement CloudABI's exec() call. Summary: In a runtime that is purely based on capability-based security, there is a strong emphasis on how programs start their execution. We need to make sure that we execute a new program with an exact set of file descriptors, ensuring that credentials are not leaked into the process accidentally. Providing the right file descriptors is just half the problem. There also needs to be a framework in place that gives meaning to these file descriptors. How does a CloudABI mail server know which of the file descriptors corresponds to the socket that receives incoming emails? Furthermore, how will this mail server acquire its configuration parameters, as it cannot open a configuration file from a global path on disk? CloudABI solves this problem by replacing traditional string command line arguments by a tree-like data structure consisting of scalars, sequences and mappings (similar to YAML/JSON). In this structure, file descriptors are treated as a first-class citizen. When calling exec(), file descriptors are passed on to the new executable if and only if they are referenced from this tree structure. See the cloudabi-run(1) man page for more details and examples (sysutils/cloudabi-utils). Fortunately, the kernel does not need to care about this tree structure at all. The C library is responsible for serializing and deserializing, but also for extracting the list of referenced file descriptors. The system call only receives a copy of the serialized data and a layout of what the new file descriptor table should look like: int proc_exec(int execfd, const void *data, size_t datalen, const int *fds, size_t fdslen); This change introduces a set of fd*_remapped() functions: - fdcopy_remapped() pulls a copy of a file descriptor table, remapping all of the file descriptors according to the provided mapping table. - fdinstall_remapped() replaces the file descriptor table of the process by the copy created by fdcopy_remapped().
- fdescfree_remapped() frees the table in case we aborted before fdinstall_remapped(). We then add a function exec_copyin_data_fds() that builds on top of these functions. It copies in the data and constructs a new remapped file descriptor. This is used by cloudabi_sys_proc_exec(). Test Plan: cloudabi-run(1) is capable of spawning processes successfully, providing it data and file descriptors. procstat -f seems to confirm all is good. Regular FreeBSD processes also work properly. Reviewers: kib, mjg Reviewed By: mjg Subscribers: imp Differential Revision: https://reviews.freebsd.org/D3079
|
#
8a08cec1 |
|
11-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
Create a dedicated function for ensuring that cdir and rdir are populated. Previously several places were doing it on their own, partially incorrectly (e.g. without the filedesc locked) or even in an actively harmful way, by populating jdir or assigning rootvnode without vrefing it. Reviewed by: kib
|
#
f0725a8e |
|
11-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
Move chdir/chroot-related fdp manipulation to kern_descrip.c Prefix exported functions with pwd_. Deduplicate some code by adding a helper for setting fd_cdir. Reviewed by: kib
|
#
9a1ad66f |
|
10-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: further cleanup of kern_dup - make mode enum start from 0 so that the assertion covers all cases [1] - rename prefix _CLOEXEC flag with _FLAG - postpone fhold on the old file descriptor, which eliminates the need to fdrop in error cases. - fixup FDDUP_FCNTL check missed in the previous commit This removes 'fp == oldfde->fde_file' assertion which had little value. kern_dup only calls fd-related functions which cannot drop the lock or a whole lot of races would be introduced. Noted by: kib [1]
|
#
5fe97c20 |
|
10-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: split kern_dup flags argument into actual flags and a mode Tidy up the code inside to switch on the mode.
|
#
2491302a |
|
09-Jul-2015 |
Ed Schouten <ed@FreeBSD.org> |
Add implementations for some of the CloudABI file descriptor system calls. All of the CloudABI system calls that operate on file descriptors of an arbitrary type are prefixed with fd_. This change adds wrappers for most of these system calls around their FreeBSD equivalents. The dup2() system call present on CloudABI deviates from POSIX, in the sense that it can only be used to replace an existing file descriptor. It cannot be used to create new ones. The reason for this is that this is inherently thread-unsafe. Furthermore, there is no need on CloudABI to use fixed file descriptor numbers. File descriptors 0, 1 and 2 have no special meaning. This change exposes kern_dup() through <sys/syscallsubr.h> and puts the FDDUP_* flags in <sys/filedesc.h>. It then adds a new flag, FDDUP_MUSTREPLACE, to force that file descriptors are replaced -- not allocated. Differential Revision: https://reviews.freebsd.org/D3035 Reviewed by: mjg
|
#
efdc2530 |
|
09-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: prepare do_dup for being exported - rename it to kern_dup. - prefix flags with FD - assert that correct flags were passed
|
#
69d11def |
|
08-Jul-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Handle copyout for the fcntl(F_OGETLK) using oflock structure. Otherwise, kernel overwrites a word past the destination. Submitted by: walter@pelissero.de PR: 196718 MFC after: 1 week
|
#
f131759f |
|
05-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make 'rights' a mandatory argument to fget* functions
|
#
dba0bec2 |
|
04-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: de-k&r-ify functions + some whitespace fixes No functional changes.
|
#
9ef8328d |
|
16-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make rights a mandatory argument to fget_unlocked
|
#
80f3623f |
|
16-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: don't unnecessarily copy capabilities in _fget
|
#
cedab3c7 |
|
14-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: reduce excessive zeroing on fd close fde_file as NULL is already an indicator of an unused fd. All other fields are populated when fp is installed.
|
#
ea31808c |
|
14-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: move out actual fp installation to _finstall Use it in fd passing functions as the first step towards fd code cleanup.
|
#
21de5aea |
|
09-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
Fixup the build after r284215. Submitted by: Ivan Klymenko <fidaj ukr.net> [slightly modified]
|
#
f6f6d240 |
|
10-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
Implement lockless resource limits. Use the same scheme implemented to manage credentials. Code needing to look at process's credentials (as opposed to thread's) is provided with *_proc variants of relevant functions. Places which possibly had to take the proc lock anyway still use the proc pointer to access limits.
|
#
3b3eb22a |
|
10-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove fdesc_mtx
|
#
153cc61b |
|
10-Jun-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: use atomics to manage fd_refcnt and fd_holcnt This gets rid of fdesc_mtx.
|
#
747c0dd6 |
|
18-May-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: fix imbalanced fdp unlock in F_SETLK and F_GETLK MFC after: 3 days
|
#
4b5c9cf6 |
|
29-Apr-2015 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add kern.racct.enable tunable and RACCT_DISABLED config option. The point of this is to be able to add RACCT (with RACCT_DISABLED) to GENERIC, to avoid having to rebuild the kernel to use rctl(8). Differential Revision: https://reviews.freebsd.org/D2369 Reviewed by: kib@ MFC after: 1 month Relnotes: yes Sponsored by: The FreeBSD Foundation
|
#
8d0a4ab2 |
|
26-Apr-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: plug an always overwritten initialization in fdalloc
|
#
90f54cbf |
|
11-Apr-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove filedesc argument from fdclose Just accept a thread instead. This makes it consistent with fdalloc. No functional changes.
|
#
ea926658 |
|
23-Mar-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: microoptimize fget_unlocked by getting rid of fd < 0 branch Casting fd to an unsigned type simplifies fd range comparison to mere checking if the result is bigger than the table.
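The unsigned-cast trick described in this message can be illustrated outside the kernel. The following is a minimal sketch, not the kernel code: the function names `fd_in_range_signed` and `fd_in_range_unsigned` and the `nfiles` parameter are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Two-comparison form: reject negative fds, then fds past the table end. */
bool
fd_in_range_signed(int fd, int nfiles)
{
	return (fd >= 0 && fd < nfiles);
}

/*
 * One-comparison form: converting a negative int to unsigned int yields a
 * value above INT_MAX, so it can never be below a non-negative table size.
 */
bool
fd_in_range_unsigned(int fd, int nfiles)
{
	assert(nfiles >= 0);	/* the table size is never negative */
	return ((unsigned int)fd < (unsigned int)nfiles);
}
```

Both forms accept and reject the same descriptors as long as the table size is non-negative; the second needs a single comparison.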
|
#
1eafc078 |
|
14-Mar-2015 |
Ian Lepore <ian@FreeBSD.org> |
Set the SBUF_INCLUDENUL flag in sbuf_new_for_sysctl() so that sysctl strings returned to userland include the nulterm byte. Some uses of sbuf_new_for_sysctl() write binary data rather than strings; clear the SBUF_INCLUDENUL flag after calling sbuf_new_for_sysctl() in those cases. (Note that the sbuf code still automatically adds a nulterm byte in sbuf_finish(), but since it's not included in the length it won't get copied to userland along with the binary data.) Remove explicit adding of a nulterm byte in a couple places now that it gets done automatically by the sbuf drain code. PR: 195668
|
#
8fbda7f0 |
|
18-Feb-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: obtain a stable copy of credentials in fget_unlocked This was broken in r278930. While here tidy up fget_mmap to use fdp from local var instead of obtaining the same pointer from td.
|
#
b7a39e9e |
|
17-Feb-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: simplify fget_unlocked & friends Introduce fget_fcntl which performs appropriate checks when needed. This removes a branch from fget_unlocked. Introduce fget_mmap dealing with cap_rights_to_vmprot conversion. This removes a branch from _fget. Modify fget_unlocked to pass sequence counter to interested callers so that they can perform their own checks and make sure the result was obtained from stable & current state. Reviewed by: silence on -hackers
|
#
5e7cd3ec |
|
21-Jan-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: avoid spurious copying of capabilities in fget_unlocked We obtain a stable copy and store it in local 'fde' variable. Storing another copy (based on aforementioned variable) does not serve any purpose. No functional changes.
|
#
f9051b0e |
|
21-Jan-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: return 0 from badfo_close The only potential in-tree consumer (_fdrop) special-cased it and returns 0 on its own instead of calling badfo_close. Remove the special case since it is not needed and very unlikely to encounter anyway. No objections from: kib
|
#
57511464 |
|
21-Jan-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: fix whitespace nits in fget and fget_read No functional changes.
|
#
c31c0579 |
|
20-Jan-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: plug a test for impossible condition in _fget
|
#
20abb66e |
|
24-Nov-2014 |
John Baldwin <jhb@FreeBSD.org> |
Properly initialize the capability rights for vnodes exported to procstat that aren't for file descriptors (cwd, jdir, tracevp, etc.). Submitted by: Mikhail <mp@lenta.ru>
|
#
0c0d16e8 |
|
22-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: plug a test for impossible condition in fgetvp_rights
|
#
eb48fbd9 |
|
13-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: fixup fdinit to lock fdp and prepare files conditionally Not all consumers providing fdp to copy from want files. Perhaps these functions should be reorganized to better express the outcome. This fixes up panics after r273895. Reported by: markj
|
#
6e646651 |
|
13-Nov-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the no-at variants of the kern_xx() syscall helpers. E.g., we have both kern_open() and kern_openat(); change the callers to use kern_openat(). This removes one (sometimes two) levels of indirection and consolidates arguments checks. Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
0e87b36e |
|
11-Nov-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Remove SF_KQUEUE code. This code was developed at Netflix, but was never used. It didn't go into stable/10, nor was it documented. It might be useful, but we collectively decided to remove it rather than leave it abandoned and unmaintained. It is removed in one single commit, so restoring it should be easy, if anyone wants to reopen this idea. Sponsored by: Netflix
|
#
bfda9935 |
|
06-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Add sysctl kern.proc.cwd It returns only current working directory of given process which saves a lot of overhead over kern.proc.filedesc if given proc has a lot of open fds. Submitted by: Tiwei Bie <btw mail.ustc.edu.cn> (slightly modified) X-Additional: JuniorJobs project
|
#
3ae366de |
|
06-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: avoid taking fdesc_mtx when not necessary in fddrop No functional changes.
|
#
eb6021fb |
|
06-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: just free old tables without altering the list which is freed anyway No functional changes.
|
#
324a7026 |
|
02-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: plug sys/kdb.h include which crept in with r274007
|
#
1d29258a |
|
02-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: plug unnecessary fdp NULL checks in fdescfree and fdcopy Anything reaching these functions has fd table.
|
#
32417098 |
|
02-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: create a dedicated zone for struct filedesc0 Currently sizeof(struct filedesc0) is 1096 bytes, which means allocations from malloc use 2048 bytes. There is no easy way to shrink the structure <= 1024 and it is likely to grow in the future.
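The 1096-to-2048 jump mentioned here is what one would expect if the general-purpose allocator rounds requests up to power-of-two buckets. The sketch below assumes exactly that; `bucket_size` and the 16-byte smallest bucket are hypothetical and do not describe the real malloc(9) bucket table.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: round a request up to the next power-of-two bucket,
 * starting from an assumed 16-byte minimum.
 */
size_t
bucket_size(size_t request)
{
	size_t bucket = 16;

	assert(request > 0);
	while (bucket < request)
		bucket *= 2;
	return (bucket);
}
```

Under this model a 1096-byte structure wastes 952 bytes per allocation, while anything at or below 1024 bytes would fit the smaller bucket; a dedicated zone sized to the structure avoids the rounding altogether.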
|
#
3dca54ab |
|
02-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: move freeing old tables to fdescfree They cannot be accessed by anyone and hold count only protects the structure from being freed.
|
#
3dc85312 |
|
02-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: factor out some code out of fdescfree Previously it had a huge self-contained chunk dedicated to dealing with shared tables. No functional changes.
|
#
080fdefc |
|
01-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: tidy up fdcheckstd No functional changes.
|
#
d3f3e12a |
|
01-Nov-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: lock filedesc lock in fdcloseexec only when needed
|
#
2534d8ee |
|
31-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: drop retval argument from do_dup It was almost always td_retval anyway. For the one case where it is not, preserve the old value across the call.
|
#
8a5177cc |
|
31-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: fix missed comments about fdsetugidsafety While here just note that both fdsetugidsafety and fdcheckstd take sleepable locks.
|
#
f652d856 |
|
31-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: make fdinit return with source filedesc locked and new one sized appropriately Assert FILEDESC_XLOCK_ASSERT only for already used tables in fdgrowtable. We don't have to call it with the lock held if we are just creating new filedesc. As a side note, strictly speaking processes can have fdtables with fd_lastfile = -1, but then they cannot enter fdgrowtable. Very first file descriptor they get will be 0 and the only syscall allowing to choose fd number requires an active file descriptor. Should this ever change, we can add an 'init' (or similar) parameter to fdgrowtable.
|
#
ffeb8905 |
|
31-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: iterate over fd table only once in fdcopy While here add 'fdused_init' which does not perform unnecessary work. Drop FILEDESC_LOCK_ASSERT from fdisused and rely on callers to hold it when appropriate. This function is only used with INVARIANTS. No functional changes intended.
|
#
1a0c80a3 |
|
31-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: tidy up fdfree Implement fdefree_last variant and get rid of 'last' parameter. No functional changes.
|
#
b97a758f |
|
30-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: tidy up fdcopy a little bit Test for file availability by fde_file != NULL instead of fdisused, this is consistent with similar checks later. Drop badfileops check. badfileops don't have DFLAG_PASSABLE set, so it was never reached in practice. fdisused is now only used in some KASSERTS, so ifdef it under INVARIANTS. No functional changes.
|
#
f55cf4b0 |
|
30-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: make sure to force table reload in fget_unlocked when count == 0 This is a fixup to r273843.
|
#
29c85772 |
|
29-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: microoptimize fget_unlocked by retrying obtaining reference count without restarting whole lookup Restart is only needed when fp was closed by current process, which is a much rarer event than ref/deref by some other thread.
|
#
aa77d528 |
|
29-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: get rid of atomic_load_acq_int from fget_unlocked A read barrier was necessary because fd table pointer and table size were updated separately, opening a window where fget_unlocked could read new size and old pointer. This patch puts both these fields into one dedicated structure, pointer to which is later atomically updated. As such, fget_unlocked only needs a data dependency barrier which is a noop on all supported architectures. Reviewed by: kib (previous version) MFC after: 2 weeks
|
#
58a3dcb2 |
|
22-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: assert that table size is at least 3 in fdsetugidsafety Requested by: kib
|
#
11888da8 |
|
21-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: cleanup setugidsafety a little Rename it to fdsetugidsafety for consistency with other functions. There is no need to take filedesc lock if not closing any files. The loop has to verify each file and we are guaranteed fdtable has space for at least 20 fds. As such there is no need to check fd_lastfile. While here tidy up is_unsafe.
|
#
f0188618 |
|
21-Oct-2014 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies
|
#
966ee9f2 |
|
20-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: plug 2 write-only variables Reported by: Coverity CID: 1245745, 1245746
|
#
55056be2 |
|
14-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: plug 2 assignments to M_ZERO-ed pointers in falloc_noinstall No functional changes.
|
#
2b4a2528 |
|
05-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
filedesc: fix up breakage introduced in 272505 Include sequence counter support unconditionally [1]. This fixes reported build problems with e.g. nvidia driver due to missing opt_capsicum.h. Replace fishy looking sizeof with offsetof. Make fde_seq the last member in order to simplify calculations. Suggested by: kib [1] X-MFC: with 272505
|
#
57c2505e |
|
05-Oct-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
On error, sbuf_bcat() returns -1. Some callers returned this -1 to the upper layers, which interpret it as errno value, which happens to be ERESTART. The result was spurious restarts of the sysctls in loop, e.g. kern.proc.proc, instead of returning ENOMEM to caller. Convert -1 from sbuf_bcat() to ENOMEM, when returning to the callers expecting errno. In collaboration with: pho Sponsored by: The FreeBSD Foundation (kib) MFC after: 1 week
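The error translation described in this message can be sketched in isolation. This is an illustrative sketch only: `fake_bcat` is a hypothetical stand-in for sbuf_bcat() (0 on success, -1 on failure) and `export_record` is a hypothetical caller whose own callers expect an errno value.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for sbuf_bcat(): 0 on success, -1 on failure. */
int
fake_bcat(int should_fail)
{
	return (should_fail ? -1 : 0);
}

/*
 * Hypothetical caller whose return value is interpreted as an errno.
 * Returning the raw -1 would be read as ERESTART by kernel callers, so
 * the failure is translated to ENOMEM before it propagates upward.
 */
int
export_record(int should_fail)
{
	int error;

	error = fake_bcat(should_fail);
	assert(error == 0 || error == -1);
	return (error == 0 ? 0 : ENOMEM);
}
```

The point of the pattern is that a library-style -1 never crosses into a layer that treats return values as errno codes.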
|
#
ee3fd7bb |
|
04-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Plug capability races. fp and appropriate capability lookups were not atomic, which could result in improper capabilities being checked. This could result either in protection bypass or in a spurious ENOTCAPABLE. Make fp + capability check atomic with the help of sequence counters. Reviewed by: kib MFC after: 3 weeks
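A sequence counter makes a multi-field read appear atomic by letting the reader detect a concurrent update and retry. The sketch below uses C11 atomics and is not the kernel's implementation: `struct entry`, `entry_write` and `entry_read` are hypothetical names, and a single writer at a time is assumed (in the kernel, writers would be serialized by a lock).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical table entry: a file handle plus the rights guarding it. */
struct entry {
	atomic_uint	seq;	/* odd while an update is in progress */
	atomic_int	file;
	atomic_uint	rights;
};

/* Single writer assumed: bump seq to odd, update fields, bump to even. */
void
entry_write(struct entry *e, int file, unsigned int rights)
{
	unsigned int s;

	s = atomic_load_explicit(&e->seq, memory_order_relaxed);
	assert((s & 1) == 0);	/* no update may already be in flight */
	atomic_store_explicit(&e->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&e->file, file, memory_order_relaxed);
	atomic_store_explicit(&e->rights, rights, memory_order_relaxed);
	atomic_store_explicit(&e->seq, s + 2, memory_order_release);
}

/* Reader: retry until both fields were read under one even seq value. */
void
entry_read(struct entry *e, int *file, unsigned int *rights)
{
	unsigned int s0, s1;

	do {
		s0 = atomic_load_explicit(&e->seq, memory_order_acquire);
		*file = atomic_load_explicit(&e->file, memory_order_relaxed);
		*rights = atomic_load_explicit(&e->rights,
		    memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&e->seq, memory_order_relaxed);
	} while ((s0 & 1) != 0 || s0 != s1);
}
```

A reader that gets through the loop has a file and a rights value that belong to the same update, which is the property needed to avoid checking one descriptor's capabilities against another descriptor's file.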
|
#
0c4a09a3 |
|
26-Sep-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Make do_dup() static and move relevant macros to kern_descrip.c No functional changes.
|
#
f69261f2 |
|
25-Sep-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix fcntl(2) compat32 after r270691. The copyin and copyout of the struct flock are done in the sys_fcntl(), which means that compat32 used direct access to userland pointers. Move code from sys_fcntl() to new wrapper, kern_fcntl_freebsd(), which performs necessary userland memory accesses, and use it from both native and compat32 fcntl syscalls. Reported by: jhibbits Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
#
9696feeb |
|
22-Sep-2014 |
John Baldwin <jhb@FreeBSD.org> |
Add a new fo_fill_kinfo fileops method to add type-specific information to struct kinfo_file. - Move the various fill_*_info() methods out of kern_descrip.c and into the various file type implementations. - Rework the support for kinfo_ofile to generate a suitable kinfo_file object for each file and then convert that to a kinfo_ofile structure rather than keeping a second, different set of code that directly manipulates type-specific file information. - Remove the shm_path() and ksem_info() layering violations. Differential Revision: https://reviews.freebsd.org/D775 Reviewed by: kib, glebius (earlier version)
|
#
2d69d0dc |
|
12-Sep-2014 |
John Baldwin <jhb@FreeBSD.org> |
Fix various issues with invalid file operations: - Add invfo_rdwr() (for read and write), invfo_ioctl(), invfo_poll(), and invfo_kqfilter() for use by file types that do not support the respective operations. Home-grown versions of invfo_poll() were universally broken (they returned an errno value, invfo_poll() uses poll_no_poll() to return an appropriate event mask). Home-grown ioctl routines also tended to return an incorrect errno (invfo_ioctl returns ENOTTY). - Use the invfo_*() functions instead of local versions for unsupported file operations. - Reorder fileops members to match the order in the structure definition to make it easier to spot missing members. - Add several missing methods to linuxfileops used by the OFED shim layer: fo_write(), fo_truncate(), fo_kqfilter(), and fo_stat(). Most of these used invfo_*(), but a dummy fo_stat() implementation was added.
|
#
0ed667f6 |
|
12-Sep-2014 |
John Baldwin <jhb@FreeBSD.org> |
Simplify vntype_to_kinfo() by returning when the desired value is found instead of breaking out of the loop and then immediately checking the loop index so that if it was broken out of the proper value can be returned. While here, use nitems().
|
#
64196a99 |
|
05-Sep-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Plug unnecessary fp assignments in kern_fcntl. No functional changes.
|
#
e86447ca |
|
26-Aug-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- Remove socket file operations declaration from sys/file.h. - Make them static in sys_socket.c. - Provide generic invfo_truncate() instead of soo_truncate(). Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
037755fd |
|
26-Aug-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Fix up races with f_seqcount handling. It was possible that the kernel would overwrite user-supplied hint. Abuse vnode lock for this purpose. In collaboration with: kib MFC after: 1 week
|
#
a1bf8115 |
|
23-Jul-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Prepare fget_unlocked for reading fd table only once. Some capsicum functions accept fdp + fd and lookup fde based on that. Add variants which accept fde. Reviewed by: pjd MFC after: 1 week
|
#
b23c40d7 |
|
10-Jul-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Don't zero fd_nfiles during fdp destruction. Code trying to take a look has to check fd_refcnt and it is 0 by that time. This is a follow up to r268505, without this the code would leak memory for tables bigger than the default. MFC after: 1 week
|
#
e518baf8 |
|
10-Jul-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Avoid relocking filedesc lock when closing fds during fdp destruction. Don't call bzero nor fdunused from fdfree for such cases. It would do unnecessary work and complain that the lock is not taken. MFC after: 1 week
|
#
b9d32c36 |
|
27-Jun-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Make fdunshare accept only td parameter. Proc had to match the thread anyway and 2 parameters were inconsistent with the rest. MFC after: 1 week
|
#
35778d7a |
|
27-Jun-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Make sure to always clear p_fd for process getting rid of its filetable. Filetable can be shared with other processes. Previous code failed to clear the pointer for all but the last process getting rid of the table. This is mostly cosmetics. Get rid of 'This should happen earlier' comment. Clearing the pointer in this place is fine as consumers can reliably check for files availability by inspecting fd_refcnt and vnodes availability by NULL-checking them. MFC after: 1 week
|
#
450570a5 |
|
22-Jun-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Tidy up fd-related functions called by do_execve o assert in each one that fdp is not shared o remove unnecessary NULL checks - all userspace processes have fdtables and kernel processes cannot execve o remove comments about the danger of fd_ofiles getting reallocated - fdtable is not shared and fd_ofiles could be only reallocated if new fd was about to be added, but if that was possible the code would already be buggy as setugidsafety work could be undone MFC after: 1 week
|
#
15862761 |
|
22-Jun-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Don't take filedesc lock in fdunshare(). We can read refcnt safely and only care if it is equal to 1. If it could suddenly change from 1 to something bigger the code would be buggy even in the previous form and transitions from > 1 to 1 are equally racy and harmless (we copy even though there is no need). MFC after: 1 week
|
#
adf87ab0 |
|
21-Jun-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: replace fd_nfiles with fd_lastfile where appropriate fd_lastfile is guaranteed to be the biggest open fd, so when the intent is to iterate over active fds or lookup one, there is no point in looking beyond that limit. Few places are left unpatched for now. MFC after: 1 week
|
#
0f0b852c |
|
21-Jun-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
do_dup: plug redundant adjustment of fd_lastfile By that time it was already set by fdalloc, or was there in the first place if fd is replaced. MFC after: 1 week
|
#
f2b1eaec |
|
02-May-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Request a non-exiting process in sysctl_kern_proc_{o,}filedesc This fixes a race with exit1 freeing p_textvp. Suggested by: kib MFC after: 1 week
|
#
210a5d16 |
|
03-Apr-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Garbage collect fdavail. It rarely returns an error and fdallocn handles the failure of fdalloc just fine.
|
#
f8043360 |
|
21-Mar-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Mark the following sysctls as MPSAFE: kern.file kern.proc.filedesc kern.proc.ofiledesc MFC after: 7 days
|
#
4c73e705 |
|
20-Mar-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Take filedesc lock only for reading when allocating new fdtable. Code populating the table does this already. MFC after: 1 week
|
#
4a144410 |
|
16-Mar-2014 |
Robert Watson <rwatson@FreeBSD.org> |
Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks
|
#
63d8fe55 |
|
21-Feb-2014 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Fix style of comment blocks. Reported by: peter Approved by: bapt (mentor, implicit) X-MFC with: r262006
|
#
1f9e8f8a |
|
21-Feb-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Fix a race between kern_proc_{o,}filedesc_out and fdescfree leading to use-after-free. fdescfree proceeds to free file pointers once fd_refcnt reaches 0, but kern_proc_{o,}filedesc_out only checked for hold count. MFC after: 3 days
|
#
70f82cfb |
|
16-Feb-2014 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Fix M_FILEDESC leak in fdgrowtable() introduced in r244510. fdgrowtable() now only reallocates fd_map when necessary. This fixes fdgrowtable() to use the same logic as fdescfree() for when to free the fd_map. The logic in fdescfree() is intended to not free the initial static allocation, however the fd_map grows at a slower rate than the table does. The table is intended to hold 20 fd, but its initial map has many more slots than 20. The slot sizing causes NDSLOTS(20) through NDSLOTS(63) to be 1 which matches NDSLOTS(20), so fdescfree() was assuming that the fd_map was still the initial allocation and not freeing it. This partially reverts r244510 by reintroducing some of the logic it removed in fdgrowtable(). Reviewed by: mjg Approved by: bapt (mentor) MFC after: 2 weeks
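The slot arithmetic described above can be checked with a small sketch. It assumes one 64-bit word per bitmap slot (as on LP64 platforms); the `TOY_` macros and functions are illustrative stand-ins, not the kernel's definitions.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: each bitmap slot is a 64-bit word. */
#define TOY_NDENTRIES	(sizeof(uint64_t) * CHAR_BIT)
#define TOY_NDSLOTS(x)	(((x) + TOY_NDENTRIES - 1) / TOY_NDENTRIES)

/* Number of bitmap words needed to track nfiles descriptors. */
unsigned
toy_ndslots(unsigned nfiles)
{
	return ((unsigned)TOY_NDSLOTS(nfiles));
}

/*
 * The flawed test described above: treat the map as the initial static
 * allocation whenever its slot count equals that of a 20-entry table.
 */
int
toy_looks_like_initial_map(unsigned nfiles)
{
	return (toy_ndslots(nfiles) == toy_ndslots(20));
}
```

Under that assumption a table grown to 40 entries still has a one-slot map, so a check based only on the slot count cannot tell a reallocated map from the initial one.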
|
#
88812f91 |
|
16-Feb-2014 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Remove redundant memcpy of fd_ofiles in fdgrowtable() added in r247602 Discussed with: mjg Approved by: bapt (mentor) MFC after: 2 weeks
|
#
231a0fe8 |
|
03-Jan-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Plug a memory leak in dup2 when both old and new fd have ioctl caps. Reviewed by: pjd MFC after: 3 days
|
#
0918d4b2 |
|
03-Jan-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Don't check for fd limits in fdgrowtable_exp. Callers do that already and additional check races with process decreasing limits and can result in not growing the table at all, which is currently not handled. MFC after: 3 days
|
#
79750e3b |
|
30-Nov-2013 |
Adrian Chadd <adrian@FreeBSD.org> |
Migrate the sendfile_sync structure into a public(ish) API in preparation for extending and reusing it. The sendfile_sync wrapper is mostly just a "mbuf transaction" wrapper, used to indicate that the backing store for a group of mbufs has completed. It's only being used by sendfile for now and it's only implementing a sleep/wakeup rendezvous. However, there are other potential signaling paths (kqueue) and other potential uses (socket zero-copy write) where the same mechanism would also be useful. So, with that in mind: * extract the sendfile_sync code out into sf_sync_*() methods * teach the sf_sync_alloc method about the current config flag - it will eventually know about kqueue. * move the sendfile_sync code out of do_sendfile() - the only thing it now knows about is the sfs pointer. The guts of the sync rendezvous (setup, rendezvous/wait, free) is now done in the syscall wrapper. * .. and teach the 32-bit compat sendfile call the same. This should be a no-op. It's primarily preparation work for teaching the sendfile_sync about kqueue notification. Tested: * Peter Holm's sendfile stress / regression scripts Sponsored by: Netflix, Inc.
|
#
f2b525e6 |
|
30-Nov-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Make process descriptors standard part of the kernel. rwhod(8) already requires process descriptors to work and having PROCDESC in GENERIC seems not enough, especially that we hope to have more and more consumers in the base. MFC after: 3 days
|
#
1744fe50 |
|
09-Oct-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
When growing the file descriptor table, new larger memory chunk is allocated, but the old table is kept around to handle the case of threads still performing unlocked accesses to it. Grow the table exponentially instead of increasing its size by sizeof(long) * 8 chunks when overflowing. This mode significantly reduces the total memory use for the processes consuming large numbers of the file descriptors which open them one by one. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (marius)
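A rough sketch of why exponential growth reduces memory here, given that old tables are kept rather than freed. The function below is hypothetical and only sums slot counts; it ignores per-slot size and allocator overhead, and expects `start` and `chunk` to be nonzero.

```c
#include <stddef.h>

/*
 * Total slots ever allocated when a table starting at `start` slots
 * grows until it can hold `want` descriptors.  Because old tables are
 * kept around, this is also the total retained.
 * If `doubling` is nonzero each step doubles the size; otherwise each
 * step adds `chunk` slots.
 */
size_t
toy_retained_slots(size_t start, size_t want, size_t chunk, int doubling)
{
	size_t size = start, total = start;

	while (size < want) {
		size = doubling ? size * 2 : size + chunk;
		total += size;
	}
	return (total);
}
```

Growing from 64 to 1024 slots in 64-slot steps retains 8704 slots in total, while doubling retains 1984.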
|
#
3625bde4 |
|
09-Oct-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Reduce code duplication, introduce the getmaxfd() helper to calculate the max filedescriptor index. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (marius)
|
#
da9442ef |
|
26-Sep-2013 |
John-Mark Gurney <jmg@FreeBSD.org> |
it must be the last member, not might... Reviewed by: attilio Approved by: re (delphij, gjb)
|
#
57a9eeb4 |
|
25-Sep-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Avoid memory accesses reordering which can result in fget_unlocked() seeing a stale fd_ofiles table once fd_nfiles is already updated, resulting in OOB accesses. Approved by: re (kib) Sponsored by: EMC / Isilon storage division Reported and tested by: pho Reviewed by: benno
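The ordering problem can be sketched in user space with C11 atomics. This is not the kernel code: `toy_table`, `toy_publish()` and `toy_lookup()` are hypothetical, and the sketch assumes the table only grows and that old arrays stay valid while readers may still use them.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical table: the size is published only after the array. */
struct toy_table {
	_Atomic(void **)	ofiles;
	atomic_int		nfiles;
};

/* Writer: install a larger array, then make the new size visible. */
void
toy_publish(struct toy_table *t, void **newfiles, int newsize)
{
	atomic_store_explicit(&t->ofiles, newfiles, memory_order_relaxed);
	/* Release: the array store above cannot be reordered after this. */
	atomic_store_explicit(&t->nfiles, newsize, memory_order_release);
}

/* Lockless reader: bounds-check against the size, then index the array. */
void *
toy_lookup(struct toy_table *t, int fd)
{
	/* Acquire pairs with the release store in toy_publish(). */
	int n = atomic_load_explicit(&t->nfiles, memory_order_acquire);
	void **files;

	if (fd < 0 || fd >= n)
		return (NULL);
	files = atomic_load_explicit(&t->ofiles, memory_order_relaxed);
	return (files[fd]);
}
```

A reader that observes the new size is guaranteed to observe an array at least that large, which is the property the commit restores.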
|
#
ab568de7 |
|
05-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Handle cases where capability rights are not provided. Reported by: kib
|
#
7008be5b |
|
04-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements.

The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg.

	#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong to the same array element, eg:

	#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
	#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
	#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);
	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg:

	cap_rights_t rights;
	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg:

	#define cap_rights_set(rights, ...) \
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x.

Sponsored by: The FreeBSD Foundation
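The index encoding described in this commit message can be illustrated with a small user-space sketch. It assumes the five index bits sit directly below the two version bits (bits 57 to 61 of each 64-bit element); the `TOY_` macros and `toy_right_index()` are hypothetical and are not the kernel's implementation.

```c
#include <stdint.h>

/* Assumed layout: bits 63-62 hold the version, bits 61-57 the index. */
#define TOY_CAPRIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define TOY_CAP_LOOKUP	TOY_CAPRIGHT(0, 0x0000000000000400ULL)
#define TOY_CAP_FCHMOD	TOY_CAPRIGHT(0, 0x0000000000002000ULL)
#define TOY_CAP_PDKILL	TOY_CAPRIGHT(1, 0x0000000000000800ULL)

/*
 * Return the array index encoded in a right, or -1 when no index bit
 * or more than one index bit is set (rights from different array
 * elements were OR'ed together).
 */
int
toy_right_index(uint64_t right)
{
	uint64_t idxbits = (right >> 57) & 0x1f;
	int i;

	if (idxbits == 0 || (idxbits & (idxbits - 1)) != 0)
		return (-1);
	for (i = 0; (idxbits & 1) == 0; i++)
		idxbits >>= 1;
	return (i);
}
```

Because each right carries exactly one index bit, OR-ing rights from the same element keeps a single bit set, while mixing elements sets two bits and can be rejected.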
|
#
ca04d21d |
|
15-Aug-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Make sendfile() a method in the struct fileops. Currently only vnode backed file descriptors have this method implemented. Reviewed by: kib Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
#
9e89077c |
|
30-Jun-2013 |
Mikolaj Golub <trociny@FreeBSD.org> |
Plug up the lock leakage when exporting to a short buffer. Reported by: Alexander Leidinger Submitted by: mjg MFC after: 1 week
|
#
07bd8bf9 |
|
28-Jun-2013 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove duplicate NULL check in kern_proc_filedesc_out. No functional changes. MFC after: 1 week
|
#
6359d169 |
|
28-Jun-2013 |
Mikolaj Golub <trociny@FreeBSD.org> |
Rework r252313: The filedesc lock may not be dropped unconditionally before exporting fd to sbuf: fd might go away during execution. While it is ok for DTYPE_VNODE and DTYPE_FIFO because the export is from a vrefed vnode here, for other types it is unsafe. Instead, drop the lock in export_fd_to_sb(), after preparing data in memory and before writing to sbuf. Spotted by: mjg Suggested by: kib Review by: kib MFC after: 1 week
|
#
bd973910 |
|
27-Jun-2013 |
Mikolaj Golub <trociny@FreeBSD.org> |
To avoid LOR, always drop the filedesc lock before exporting fd to sbuf. Reviewed by: kib MFC after: 3 days
|
#
958aa575 |
|
03-May-2013 |
John Baldwin <jhb@FreeBSD.org> |
Similar to 233760 and 236717, export some more useful info about the kernel-based POSIX semaphore descriptors to userland via procstat(1) and fstat(1): - Change sem file descriptors to track the pathname they are associated with and add a ksem_info() method to copy the path out to a caller-supplied buffer. - Use the fo_stat() method of shared memory objects and ksem_info() to export the path, mode, and value of a semaphore via struct kinfo_file. - Add a struct semstat to the libprocstat(3) interface along with a procstat_get_sem_info() to export the mode and value of a semaphore. - Teach fstat about semaphores and to display their path, mode, and value. MFC after: 2 weeks
|
#
fe52cf54 |
|
14-Apr-2013 |
Mikolaj Golub <trociny@FreeBSD.org> |
Re-factor the code to provide kern_proc_filedesc_out(), kern_proc_out(), and kern_proc_vmmap_out() functions to output process kinfo structures to sbuf, to make the code reusable. The functions are going to be used in the coredump routine to store procstat info in the core program header notes. Reviewed by: kib MFC after: 3 weeks
|
#
db8f33fd |
|
14-Apr-2013 |
Mateusz Guzik <mjg@FreeBSD.org> |
Add fdallocn function and use it when passing fds over unix socket. This gets rid of "unp_externalize fdalloc failed" panic. Reviewed by: pjd MFC after: 1 week
|
#
c9d59a63 |
|
07-Apr-2013 |
Mikolaj Golub <trociny@FreeBSD.org> |
Use pget(9) to reduce code duplication. MFC after: 1 week
|
#
5f39e565 |
|
03-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Use dedicated malloc type for filecaps-related data, so we can detect any memory leaks easier.
|
#
a6157c3d |
|
03-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Plug memory leaks in file descriptors passing.
|
#
2609222a |
|
01-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge Capsicum overhaul:

- Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights.
- The cap_new(2) system call is left, but it is no longer documented and should not be used in new code.
- The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one.
- The cap_getrights(2) syscall is renamed to cap_rights_get(2).
- If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrieved with cap_ioctls_get(2) syscall.
- If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrieve them with cap_fcntls_get(2).
- To support ioctl and fcntl white-listing the filedesc structure was heavily modified.
- The audit subsystem, kdump and procstat tools were updated to recognize new syscalls.
- Capability rights were revised and even though I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below:

	CAP_CREATE old behaviour:
	- Allow for openat(2)+O_CREAT.
	- Allow for linkat(2).
	- Allow for symlinkat(2).
	CAP_CREATE new behaviour:
	- Allow for openat(2)+O_CREAT.

	Added CAP_LINKAT:
	- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
	- Allow to be target for renameat(2).

	Added CAP_SYMLINKAT:
	- Allow for symlinkat(2).

	Removed CAP_DELETE. Old behaviour:
	- Allow for unlinkat(2) when removing non-directory object.
	- Allow to be source for renameat(2).

	Removed CAP_RMDIR. Old behaviour:
	- Allow for unlinkat(2) when removing directory.

	Added CAP_RENAMEAT:
	- Required for source directory for the renameat(2) syscall.

	Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
	- Allow for unlinkat(2) on any object.
	- Required if target of renameat(2) exists and will be removed by this call.

	Removed CAP_MAPEXEC.

	CAP_MMAP old behaviour:
	- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE.
	CAP_MMAP new behaviour:
	- Allow for mmap(2)+PROT_NONE.

	Added CAP_MMAP_R:
	- Allow for mmap(PROT_READ).
	Added CAP_MMAP_W:
	- Allow for mmap(PROT_WRITE).
	Added CAP_MMAP_X:
	- Allow for mmap(PROT_EXEC).
	Added CAP_MMAP_RW:
	- Allow for mmap(PROT_READ | PROT_WRITE).
	Added CAP_MMAP_RX:
	- Allow for mmap(PROT_READ | PROT_EXEC).
	Added CAP_MMAP_WX:
	- Allow for mmap(PROT_WRITE | PROT_EXEC).
	Added CAP_MMAP_RWX:
	- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

	Renamed CAP_MKDIR to CAP_MKDIRAT.
	Renamed CAP_MKFIFO to CAP_MKFIFOAT.
	Renamed CAP_MKNODE to CAP_MKNODEAT.

	CAP_READ old behaviour:
	- Allow pread(2).
	- Disallow read(2), readv(2) (if there is no CAP_SEEK).
	CAP_READ new behaviour:
	- Allow read(2), readv(2).
	- Disallow pread(2) (CAP_SEEK was also required).

	CAP_WRITE old behaviour:
	- Allow pwrite(2).
	- Disallow write(2), writev(2) (if there is no CAP_SEEK).
	CAP_WRITE new behaviour:
	- Allow write(2), writev(2).
	- Disallow pwrite(2) (CAP_SEEK was also required).

	Added convenient defines:

	#define CAP_PREAD (CAP_SEEK | CAP_READ)
	#define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
	#define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
	#define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
	#define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
	#define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
	#define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
	#define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
	#define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
	#define CAP_RECV CAP_READ
	#define CAP_SEND CAP_WRITE
	#define CAP_SOCK_CLIENT \
		(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
		 CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
	#define CAP_SOCK_SERVER \
		(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
		 CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
		 CAP_SETSOCKOPT | CAP_SHUTDOWN)

	Added defines for backward API compatibility:

	#define CAP_MAPEXEC CAP_MMAP_X
	#define CAP_DELETE CAP_UNLINKAT
	#define CAP_MKDIR CAP_MKDIRAT
	#define CAP_RMDIR CAP_UNLINKAT
	#define CAP_MKFIFO CAP_MKFIFOAT
	#define CAP_MKNOD CAP_MKNODAT
	#define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by: The FreeBSD Foundation
Reviewed by: Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with: rwatson, benl, jonathan
ABI compatibility discussed with: kib
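The CAP_READ / CAP_PREAD change above amounts to a subset test on a rights mask. The sketch below uses made-up bit values (the real constants are defined in the capability header) and a hypothetical `toy_rights_allow()` helper, only to show how a composite right such as CAP_PREAD behaves.

```c
#include <stdint.h>

/* Illustrative bit values only; not the real kernel constants. */
#define TOY_CAP_READ	0x1ULL
#define TOY_CAP_WRITE	0x2ULL
#define TOY_CAP_SEEK	0x4ULL
#define TOY_CAP_PREAD	(TOY_CAP_SEEK | TOY_CAP_READ)
#define TOY_CAP_PWRITE	(TOY_CAP_SEEK | TOY_CAP_WRITE)

/* Nonzero when every right in `need` is present in `have`. */
int
toy_rights_allow(uint64_t have, uint64_t need)
{
	return ((have & need) == need);
}
```

A descriptor limited to the read right alone passes a check for read(2) but fails one for pread(2), which needs the seek right as well.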
|
#
1d59211b |
|
25-Feb-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style. Suggested by: kib
|
#
893365e4 |
|
25-Feb-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
After r237012, the fdgrowtable() doesn't drop the filedesc lock anymore, so update a stale comment. Reviewed by: kib, keramida
|
#
4881a595 |
|
17-Feb-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Don't treat pointers as booleans.
|
#
74938cbb |
|
13-Feb-2013 |
Ian Lepore <ian@FreeBSD.org> |
Make the F_READAHEAD option to fcntl(2) work as documented: a value of zero now disables read-ahead. It used to effectively restore the system default readahead heuristic if it had been changed; a negative value now restores the default. Reviewed by: kib
|
#
a2c496eb |
|
31-Jan-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove label that was accidentally moved during Giant removal from VFS.
|
#
b5471c91 |
|
20-Dec-2012 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Rewrite fdgrowtable() so common mortals can actually understand what it does and how, and add comments describing the data structures and explaining how they are managed.
|
#
5050aa86 |
|
22-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
|
#
d8c1da8b |
|
27-Jul-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Add F_DUP2FD_CLOEXEC. Apparently Solaris 11 already did this. Submitted by: Jukka A. Ukkonen <jau iki fi> PR: standards/169962 MFC after: 1 week
|
#
a53cab2c |
|
21-Jul-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
(Incomplete) fixes for symbols visibility issues and style in fcntl.h. Append '__' prefix to the tag of struct oflock, and put it under BSD namespace. Structure is needed both by libc and kernel, thus cannot be hidden under #ifdef _KERNEL. Move a set of non-standard F_* and O_* constants into BSD namespace. SUSv4 explicitly allows implementation to pollute F_* and O_* names after fcntl.h is included, but it costs us nothing to adhere to the specification if exact POSIX compliance level is requested by user code. Change some spaces after #define to tabs. Noted by and discussed with: bde MFC after: 1 week
|
#
eb3d9754 |
|
19-Jul-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove line which was accidentally kept in r238614. Submitted by: pjd Pointy hat to: kib MFC after: 1 week
|
#
49d02b13 |
|
19-Jul-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement F_DUPFD_CLOEXEC command for fcntl(2), specified by SUSv4. PR: standards/169962 Submitted by: Jukka A. Ukkonen <jau iki fi> MFC after: 1 week
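A short usage sketch of the command this commit adds. `dup_cloexec()` is a hypothetical wrapper; the only interface relied on is the standard `fcntl(fd, F_DUPFD_CLOEXEC, minfd)` call.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate fd onto the lowest free descriptor >= minfd, with the
 * close-on-exec flag set on the new descriptor in the same call.
 * Returns the new descriptor, or -1 with errno set.
 */
int
dup_cloexec(int fd, int minfd)
{
	return (fcntl(fd, F_DUPFD_CLOEXEC, minfd));
}
```

Setting the flag as part of the duplication avoids the window that exists between a plain F_DUPFD and a later F_SETFD, during which another thread could fork and exec.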
|
#
4fd85c4b |
|
08-Jul-2012 |
Mateusz Guzik <mjg@FreeBSD.org> |
Follow-up commit to r238220: Pass only FEXEC (instead of FREAD|FEXEC) in fgetvp_exec. _fget has to check for !FWRITE anyway and may as well know about FREAD. Make _fget code a bit more readable by converting permission checking from if() to switch(). Assert that correct permission flags are passed. In collaboration with: kib Approved by: trasz (mentor) MFC after: 6 days X-MFC: with r238220
|
#
28a7f607 |
|
07-Jul-2012 |
Mateusz Guzik <mjg@FreeBSD.org> |
Unbreak handling of descriptors opened with O_EXEC by fexecve(2). While here return EBADF for descriptors opened for writing (previously it was ETXTBSY). Add fgetvp_exec function which performs appropriate checks. PR: kern/169651 In collaboration with: kib Approved by: trasz (mentor) MFC after: 1 week
|
#
c5c1199c |
|
02-Jul-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Extend the KPI to lock and unlock f_offset member of struct file. It now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is 64bit quantity, always read and written under the mtxpool protection. This fixes apparently easy to trigger race when parallel lseek()s or lseek() and read/write could destroy file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks
|
#
d99e1d5f |
|
17-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Don't check for race with close on advisory unlock (there is nothing smart we can do when such a race occurs). This saves lock/unlock cycle for the filedesc lock for every advisory unlock operation. MFC after: 1 month
|
#
604a7c2f |
|
17-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Extend the comment about checking for a race with close to explain why it is done and why we don't return an error in such case. Discussed with: kib MFC after: 1 month
|
#
fd6049b1 |
|
17-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
If VOP_ADVLOCK() call or earlier checks failed don't check for a race with close, because even if we had a race there is nothing to unlock. Discussed with: kib MFC after: 1 month
|
#
cff2dcd1 |
|
15-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Revert r237073. 'td' can be NULL here. MFC after: 1 month
|
#
3cde71cb |
|
15-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
One more attempt to make prototypes formatted according to style(9), which hopefully recovers from the "worse than useless" state. Reported by: bde MFC after: 1 month
|
#
19a8f674 |
|
14-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove fdtofp() function and use fget_locked(), which works exactly the same. MFC after: 1 month
|
#
b7fc69ca |
|
14-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Assert that the filedesc lock is being held when the fdunwrap() function is called. MFC after: 1 month
|
#
1a94dc85 |
|
14-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Simplify the code by making more use of the fdtofp() function. MFC after: 1 month
|
#
215aeba9 |
|
14-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Assert that the filedesc lock is being held when fdisused() is called. - Fix white spaces. MFC after: 1 month
|
#
7aef7542 |
|
14-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style fixes and assertions improvements. MFC after: 1 month
|
#
8d169d9f |
|
14-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Assert that the filedesc lock is not held when closef() is called. MFC after: 1 month
|
#
eb273c01 |
|
14-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style fixes. Reported by: bde MFC after: 1 month
|
#
c7e9a659 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove code duplication from fdclosexec(), which was the reason of the bug fixed in r237065. MFC after: 1 month
|
#
8f59e9fd |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
When we are closing capabilities during exec, we want to call mq_fdclose() on the underlying object and not on the capability itself. Similar bug was fixed in r236853. MFC after: 1 month
|
#
5570ae7d |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style. MFC after: 1 month
|
#
62021672 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
When checking if file descriptor number is valid, explicitly check for 'fd' being less than 0 instead of using cast-to-unsigned hack. Today's commit was brought to you by the letters 'B', 'D' and 'E' :)
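The two styles of range check mentioned above can be compared side by side. Both functions are hypothetical illustrations, and they agree only under the assumption that `nfiles` is non-negative.

```c
/* Cast-to-unsigned form: a negative fd wraps to a very large value. */
int
toy_fd_valid_cast(int fd, int nfiles)
{
	return ((unsigned int)fd < (unsigned int)nfiles);
}

/* Explicit form: same result, with the intent spelled out. */
int
toy_fd_valid_explicit(int fd, int nfiles)
{
	return (fd >= 0 && fd < nfiles);
}
```

The explicit form costs one extra comparison in the source but does not depend on the reader knowing how signed-to-unsigned conversion behaves.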
|
#
3812dcd3 |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Allocate descriptor number in dupfdopen() itself instead of depending on the caller using finstall(). This saves us the filedesc lock/unlock cycle, fhold()/fdrop() cycle and closes a race between finstall() and dupfdopen(). MFC after: 1 month
|
#
6195bfeb |
|
13-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
There is only one caller of the dupfdopen() function, so we can simplify it a bit: - We can assert that only ENODEV and ENXIO errors are passed instead of handling other errors. - The caller always call finstall() for indx descriptor, so we can assume it is set. Actually the filedesc lock is dropped between finstall() and dupfdopen(), so there is a window there for another thread to close the indx descriptor, but it will be closed in next commit. Reviewed by: mjg MFC after: 1 month
|
#
2ca63f0a |
|
13-Jun-2012 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove 'low' argument from fd_last_used(). This function is static and the only caller always passes 0 as low. While here update note about return values in comment. Reviewed by: pjd Approved by: trasz (mentor) MFC after: 1 month
|
#
02efb9a8 |
|
13-Jun-2012 |
Mateusz Guzik <mjg@FreeBSD.org> |
Re-apply reverted parts of r236935 by pjd with some changes. If fdalloc() decides to grow fdtable it does it once and at most doubles the size. This still may be not enough for sufficiently large fd. Use fd in calculations of new size in order to fix this. When growing the table, fd is already equal to first free descriptor >= minfd, also fdgrowtable() no longer drops the filedesc lock. As a result of this there is no need to retry allocation nor lookup. Fix description of fd_first_free to note all return values. In co-operation with: pjd Approved by: trasz (mentor) MFC after: 1 month
|
#
faf0db35 |
|
12-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Revert part of the r236935 for now, until I figure out why it doesn't work properly. Reported by: davidxu
|
#
039dc89f |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
fdgrowtable() no longer drops the filedesc lock so it is enough to retry finding free file descriptor only once after fdgrowtable(). Spotted by: pluknet MFC after: 1 month
|
#
d3ec30e5 |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Use consistent way of checking if descriptor number is valid. MFC after: 1 month
|
#
fd45a47b |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Be consistent with white spaces. MFC after: 1 month
|
#
19d9c0e1 |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove code duplicated in kern_close() and do_dup() and use closefp() function introduced a minute ago. This code duplication was responsible for the bug fixed in r236853. Discussed with: kib Tested by: pho MFC after: 1 month
|
#
642db963 |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Introduce closefp() function that we will be able to use to eliminate code duplication in kern_close() and do_dup(). This is committed separately from the actual removal of the duplicated code, as the combined diff was very hard to read. Discussed with: kib Tested by: pho MFC after: 1 month
|
#
129c87eb |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge two ifs into one to make the code almost identical to the code in kern_close(). Discussed with: kib Tested by: pho MFC after: 1 month
|
#
d327cee2 |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Move the code around a bit to move two parts of code duplicated from kern_close() close together. Discussed with: kib Tested by: pho MFC after: 1 month
|
#
8b407931 |
|
11-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Now that fdgrowtable() doesn't drop the filedesc lock we don't need to check if descriptor changed from under us. Replace the check with an assert. Discussed with: kib Tested by: pho MFC after: 1 month
|
#
69d76148 |
|
10-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
When we are closing capability during dup2(), we want to call mq_fdclose() on the underlying object and not on the capability itself. Discussed with: rwatson Sponsored by: FreeBSD Foundation MFC after: 1 month
|
#
1b693d74 |
|
10-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge two ifs into one. Other minor style fixes. MFC after: 1 month
|
#
8849ae72 |
|
10-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Simplify fdtofp(). MFC after: 1 month
|
#
e59a9736 |
|
09-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
There is no need to drop the FILEDESC lock around malloc(M_WAITOK) anymore, as we now use sx lock for filedesc structure protection. Reviewed by: kib MFC after: 1 month
|
#
68abac43 |
|
09-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Remove now unused variable. MFC after: 1 month MFC with: r236820
|
#
380513aa |
|
09-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Make some of the loops more readable. Reviewed by: tegge MFC after: 1 month
|
#
5d02ed91 |
|
08-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Correct panic message. MFC after: 1 month MFC with: r236731
|
#
bf3e37ef |
|
07-Jun-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
In fdalloc() f_ofileflags for the newly allocated descriptor has to be 0. Assert that instead of setting it to 0. Sponsored by: FreeBSD Foundation MFC after: 1 month
|
#
847d0034 |
|
11-Apr-2012 |
Eitan Adler <eadler@FreeBSD.org> |
Return EBADF instead of EMFILE from dup2 when the second argument is outside the range of valid file descriptors PR: kern/164970 Submitted by: Peter Jeremy <peterjeremy@acm.org> Reviewed by: jilles Approved by: cperciva MFC after: 1 week
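The behaviour can be observed from user space with a small helper. `dup2_errno()` is a hypothetical wrapper around the standard `dup2(2)` call; the error value for an out-of-range target is the one this commit describes.

```c
#include <errno.h>
#include <unistd.h>

/* Return the errno produced by dup2(oldfd, newfd), or 0 on success. */
int
dup2_errno(int oldfd, int newfd)
{
	errno = 0;
	if (dup2(oldfd, newfd) == -1)
		return (errno);
	return (0);
}
```

Calling it with a valid `oldfd` and a `newfd` that is negative or beyond the descriptor limit should report EBADF rather than EMFILE.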
|
#
e506e182 |
|
01-Apr-2012 |
John Baldwin <jhb@FreeBSD.org> |
Export some more useful info about shared memory objects to userland via procstat(1) and fstat(1): - Change shm file descriptors to track the pathname they are associated with and add a shm_path() method to copy the path out to a caller-supplied buffer. - Use the fo_stat() method of shared memory objects and shm_path() to export the path, mode, and size of a shared memory object via struct kinfo_file. - Add a struct shmstat to the libprocstat(3) interface along with a procstat_get_shm_info() to export the mode and size of a shared memory object. - Change procstat to always print out the path for a given object if it is valid. - Teach fstat about shared memory objects and to display their path, mode, and size. MFC after: 2 weeks
|
#
ffae9d4d |
|
08-Mar-2012 |
Peter Holm <pho@FreeBSD.org> |
Free up allocated memory used by posix_fadvise(2).
|
#
0e31b3c1 |
|
14-Nov-2011 |
David E. O'Brien <obrien@FreeBSD.org> |
Reformat comment to be more readable in standard Xterm. (while I'm here, wrap other long lines)
|
#
dccc45e4 |
|
03-Nov-2011 |
John Baldwin <jhb@FreeBSD.org> |
Move the cleanup of f_cdevpriv when the reference count of a devfs file descriptor drops to zero out of _fdrop() and into devfs_close_f() as it is only relevant for devfs file descriptors. Reviewed by: kib MFC after: 1 week
|
#
b160c141 |
|
11-Oct-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Correct a bug in export of capability-related information from the sysctls supporting procstat -f: properly provide capability rights information to userspace. The bug resulted from a merge-o during upstreaming (or rather, a failure to properly merge FreeBSD-side changes downstream). Spotted by: des, kibab MFC after: 3 days
|
#
8451d0dd |
|
16-Sep-2011 |
Kip Macy <kmacy@FreeBSD.org> |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
|
#
cfb5f768 |
|
18-Aug-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Add experimental support for process descriptors A "process descriptor" file descriptor is used to manage processes without using the PID namespace. This is required for Capsicum's Capability Mode, where the PID namespace is unavailable. New system calls pdfork(2) and pdkill(2) offer the functional equivalents of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote process for debugging purposes. The currently-unimplemented pdwait(2) will, in the future, allow querying rusage/exit status. In the interim, poll(2) may be used to check (and wait for) process termination. When a process is referenced by a process descriptor, it does not issue SIGCHLD to the parent, making it suitable for use in libraries---a common scenario when using library compartmentalisation from within large applications (such as web browsers). Some observers may note a similarity to Mach task ports; process descriptors provide a subset of this behaviour, but in a UNIX style. This feature is enabled by "options PROCDESC", but as with several other Capsicum kernel features, is not enabled by default in GENERIC 9.0. Reviewed by: jhb, kib Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc
|
#
9c00bb91 |
|
16-Aug-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add the fo_chown and fo_chmod methods to struct fileops and use them to implement fchown(2) and fchmod(2) support for several file types that previously lacked it. Add MAC entries for chown/chmod done on posix shared memory and (old) in-kernel posix semaphores. Based on the submission by: glebius Reviewed by: rwatson Approved by: re (bz)
|
#
69d377fe |
|
13-Aug-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Allow Capsicum capabilities to delegate constrained access to file system subtrees to sandboxed processes. - Use of absolute paths and '..' are limited in capability mode. - Use of absolute paths and '..' are limited when looking up relative to a capability. - When a name lookup is performed, identify what operation is to be performed (such as CAP_MKDIR) as well as check for CAP_LOOKUP. With these constraints, openat() and friends are now safe in capability mode, and can then be used by code such as the capability-mode runtime linker. Approved by: re (bz), mentor (rwatson) Sponsored by: Google Inc
|
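The lookup constraints described in this entry can be illustrated with a small userland check. This is only a sketch under stated assumptions: `path_ok_in_sandbox()` is a hypothetical helper, not the kernel's namei() logic, and it is deliberately stricter than "limited" (it rejects every absolute path and every `..` component).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not kernel code): returns true only if `path`
 * is relative and contains no ".." component.
 */
bool
path_ok_in_sandbox(const char *path)
{
	const char *p = path;

	assert(path != NULL);

	if (path[0] == '/')
		return (false);		/* absolute paths are rejected */

	while (*p != '\0') {
		size_t len = strcspn(p, "/");	/* length of this component */

		if (len == 2 && p[0] == '.' && p[1] == '.')
			return (false);		/* ".." could climb out */
		p += len;
		while (*p == '/')
			p++;			/* skip separators */
	}
	return (true);
}
```

A name such as `..b` is a normal component and passes; only a component that is exactly `..` is refused.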
#
a9d2f8d8 |
|
10-Aug-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0: Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op. Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions. In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit. Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent. Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
#
c30b9b51 |
|
20-Jul-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Export capability information via sysctls. When reporting on a capability, flag the fact that it is a capability, but also unwrap to report all of the usual information about the underlying file. Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc
|
#
745bae37 |
|
15-Jul-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Add implementation for capabilities. Code to actually implement Capsicum capabilities, including fileops and kern_capwrap(), which creates a capability to wrap an existing file descriptor. We also modify kern_close() and closef() to handle capabilities. Finally, remove cap_filelist from struct capability, since we don't actually need it. Approved by: mentor (rwatson), re (Capsicum blanket) Sponsored by: Google Inc
|
#
5604e481 |
|
07-Jul-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Fix the "passability" test in fdcopy(). Rather than checking to see if a descriptor is a kqueue, check to see if its fileops flags include DFLAG_PASSABLE. At the moment, these two tests are equivalent, but this will change with the addition of capabilities that wrap kqueues but are themselves of type DTYPE_CAPABILITY. We already have the DFLAG_PASSABLE abstraction, so let's use it. This change has been tested with [the newly improved] tools/regression/kqueue. Approved by: mentor (rwatson), re (Capsicum blanket) Sponsored by: Google Inc
|
#
afcc55f3 |
|
06-Jul-2011 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
All the racct_*() calls need to happen with the proc locked. Fixing this won't happen before 9.0. This commit adds "#ifdef RACCT" around all the "PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order to avoid useless locking/unlocking in kernels built without "options RACCT".
|
#
9acdfe65 |
|
05-Jul-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Rework _fget to accept capability parameters. This new version of _fget() requires new parameters: - cap_rights_t needrights the rights that we expect the capability's rights mask to include (e.g. CAP_READ if we are going to read from the file) - cap_rights_t *haverights used to return the capability's rights mask (ignored if NULL) - u_char *maxprotp the maximum mmap() rights (e.g. VM_PROT_READ) that can be permitted (only used if we are going to mmap the file; ignored if NULL) - int fget_flags FGET_GETCAP if we want to return the capability itself, rather than the underlying object which it wraps Approved by: mentor (rwatson), re (Capsicum blanket) Sponsored by: Google Inc
|
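The needrights/haverights contract described in this entry amounts to a subset test on a bit mask. A minimal sketch, assuming a 64-bit mask; `rights_t`, the `R_*` bit values, and `rights_check()` are illustrative names and not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t rights_t;		/* stand-in for cap_rights_t */

#define	R_READ	((rights_t)1 << 0)	/* hypothetical bit values */
#define	R_WRITE	((rights_t)1 << 1)
#define	R_SEEK	((rights_t)1 << 2)
#define	R_ALL	(R_READ | R_WRITE | R_SEEK)

/*
 * Returns 0 if every bit in `need` is present in `have`, -1 otherwise.
 * If `haverights` is non-NULL, the full mask is reported back through it.
 */
int
rights_check(rights_t have, rights_t need, rights_t *haverights)
{
	assert((need & ~R_ALL) == 0);	/* only known bits may be requested */

	if (haverights != NULL)
		*haverights = have;
	return ((have & need) == need ? 0 : -1);
}
```

Passing NULL for `haverights` mirrors the "ignored if NULL" wording in the commit message.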
#
c0467b5e |
|
30-Jun-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
When Capsicum starts creating capabilities to wrap existing file descriptors, we will want to allocate a new descriptor without installing it in the FD array. Split falloc() into falloc_noinstall() and finstall(), and rewrite falloc() to call them with appropriate atomicity. Approved by: mentor (rwatson), re (bz)
|
#
ff6f41a4 |
|
12-May-2011 |
Stanislav Sedov <stas@FreeBSD.org> |
- Do not try to drop a NULL filedesc pointer.
|
#
0daf62d9 |
|
12-May-2011 |
Stanislav Sedov <stas@FreeBSD.org> |
- Commit work from libprocstat project. These patches add support for runtime file and processes information retrieval from the running kernel via sysctl in the form of new library, libprocstat. The library also supports KVM backend for analyzing memory crash dumps. Both procstat(1) and fstat(1) utilities have been modified to take advantage of the library (as the bonus point the fstat(1) utility no longer needs superuser privileges to operate), and the procstat(1) utility is now able to display information from memory dumps as well. The newly introduced fuser(1) utility also uses this library and is able to operate via sysctl and kvm backends. The library is by no means complete (e.g. KVM backend is missing vnode name resolution routines, and there're no manpages for the library itself) so I plan to improve it further. I'm committing it so it will get wider exposure and review. We won't be able to MFC this work as it relies on changes in HEAD, which were introduced some time ago, that break kernel ABI. OTOH we may be able to merge the library with KVM backend if we really need it there. Discussed with: rwatson
|
#
722581d9 |
|
06-Apr-2011 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add RACCT_NOFILE accounting. Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
|
#
1fe80828 |
|
01-Apr-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
After the r219999 is merged to stable/8, rename fallocf(9) to falloc(9) and remove the falloc() version that lacks flag argument. This is done to reduce the KPI bloat. Requested by: jhb X-MFC-note: do not
|
#
246d35ec |
|
25-Mar-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add O_CLOEXEC flag to open(2) and fhopen(2). The new function fallocf(9), that is renamed falloc(9) with added flag argument, is provided to facilitate the merge to stable branch. Reviewed by: jhb MFC after: 1 week
|
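From userland, the effect of the O_CLOEXEC flag can be observed with fcntl(2). A minimal sketch; `open_cloexec()` and `has_cloexec()` are hypothetical helper names, and the feature-test macro is an assumption needed on some libcs to expose O_CLOEXEC:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` read-only with close-on-exec set at open time; fd or -1. */
int
open_cloexec(const char *path)
{
	assert(path != NULL);
	return (open(path, O_RDONLY | O_CLOEXEC));
}

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if not, -1 on error. */
int
has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags == -1)
		return (-1);
	return ((flags & FD_CLOEXEC) != 0);
}
```

Setting the flag in open(2) avoids the window between open() and a later fcntl(F_SETFD) in which another thread could fork and exec.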
#
8e6fa660 |
|
24-Mar-2011 |
John Baldwin <jhb@FreeBSD.org> |
Fix some locking nits with the p_state field of struct proc: - Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL in fork to honor the locking requirements. While here, expand the scope of the PROC_LOCK() on the new process (p2) to avoid some LORs. Previously the code was locking the new child process (p2) after it had locked the parent process (p1). However, when locking two processes, the safe order is to lock the child first, then the parent. - Fix various places that were checking p_state against PRS_NEW without having the process locked to use PROC_LOCK(). Every place was already locking the process, just after the PRS_NEW check. - Remove or reduce the use of PROC_SLOCK() for places that were checking p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for reading the current state. - Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once. MFC after: 1 week
|
#
1fb51a12 |
|
16-Feb-2011 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Mfp4 CH=177274,177280,177284-177285,177297,177324-177325 VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched. Add assertions to the socket code to catch possibly unset vnets as seen in r204147. While this reduces the number of vnet recursion in some places like NFS, POSIX local sockets and some netgraph, .. recursions are impossible to fix. The current expectations are documented at the beginning of uipc_socket.c along with the other information there. Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb Tested by: zec Tested by: Mikolaj Golub (to.my.trociny gmail.com) MFC after: 2 weeks
|
#
90750179 |
|
28-Jan-2011 |
Jilles Tjoelker <jilles@FreeBSD.org> |
Do not trip a KASSERT if /dev/null cannot be opened for a setuid program. The fdcheckstd() function makes sure fds 0, 1 and 2 are open by opening /dev/null. If this fails (e.g. missing devfs or wrong permissions), fdcheckstd() will return failure and the process will exit as if it received SIGABRT. The KASSERT is only to check that kern_open() returns the expected fd, given that it succeeded. Tripping the KASSERT is most likely if fd 0 is open but fd 1 or 2 are not. MFC after: 2 weeks
|
#
23b70c1a |
|
04-Jan-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Finish r210923, 210926. Mark some devices as eternal. MFC after: 2 weeks
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
bca61398 |
|
02-May-2010 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFC r207116: Remove one zero from the double-0. This code doesn't have a license to kill.
|
#
2f826fdf |
|
23-Apr-2010 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Remove one zero from the double-0. This code doesn't have a license to kill. MFC after: 3 days
|
#
931d1367 |
|
07-Dec-2009 |
Xin LI <delphij@FreeBSD.org> |
MFC revision 197579 and 199617: Add two new fcntls to enable/disable read-ahead: - F_READAHEAD: specify the amount for sequential access. The amount is specified in bytes and is rounded up to nearest block size. - F_RDAHEAD: Darwin compatible version that uses 128KB as the sequential access size. A third argument of zero disables the read-ahead behavior. Please note that the read-ahead amount is also constrained by sysctl variable, vfs.read_max, which may need to be raised in order to better utilize this feature. Thanks Igor Sysoev for proposing the feature and submitting the original version, and kib@ for his valuable comments.
|
#
08013621 |
|
20-Nov-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
On the return path from F_RDAHEAD and F_READAHEAD fcntls, do not unlock Giant twice. While there, bring conditions in the do/while loops closer to style, that also makes the lines fit into 80 columns. Reported and tested by: dougb
|
#
82aebf69 |
|
28-Sep-2009 |
Xin LI <delphij@FreeBSD.org> |
Add two new fcntls to enable/disable read-ahead: - F_READAHEAD: specify the amount for sequential access. The amount is specified in bytes and is rounded up to nearest block size. - F_RDAHEAD: Darwin compatible version that uses 128KB as the sequential access size. A third argument of zero disables the read-ahead behavior. Please note that the read-ahead amount is also constrained by sysctl variable, vfs.read_max, which may need to be raised in order to better utilize this feature. Thanks Igor Sysoev for proposing the feature and submitting the original version, and kib@ for his valuable comments. Submitted by: Igor Sysoev <is rambler-co ru> Reviewed by: kib@ MFC after: 1 month
|
#
14961ba7 |
|
27-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Replace AUDIT_ARG() with variable argument macros with a set of more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week
|
#
2dd43eca |
|
24-Jun-2009 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Similar to the previous commit, but for CURRENT: Fix a bug where a FIFO vnode use count was increased twice, but only decreased once.
|
#
7083246d |
|
24-Jun-2009 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Fix a bug where a FIFO vnode use count was increased twice, but only decreased once. MFC after: 1 week
|
#
c4f16b69 |
|
15-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Add a new 'void closefrom(int lowfd)' system call. When called, it closes any open file descriptors >= 'lowfd'. It is largely identical to the same function on other operating systems such as Solaris, DFly, NetBSD, and OpenBSD. One difference from other *BSD is that this closefrom() does not fail with any errors. In practice, while the manpages for NetBSD and OpenBSD claim that they return EINTR, they ignore internal errors from close() and never return EINTR. DFly does return EINTR, but for the common use case (closing fd's prior to execve()), the caller really wants all fd's closed and returning EINTR just forces callers to call closefrom() in a loop until it stops failing. Note that this implementation of closefrom(2) does not make any effort to resolve userland races with open(2) in other threads. As such, it is not multithread safe. Submitted by: rwatson (initial version) Reviewed by: rwatson MFC after: 2 weeks
|
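The "close everything at or above lowfd, never fail" behaviour described in this entry can be approximated in portable userland code. This is a sketch, not the kernel implementation: `closefrom_sketch()` is a hypothetical name, and the explicit `maxfd` upper bound is an assumption added so the loop is bounded.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical userland emulation: close every descriptor in
 * [lowfd, maxfd).  Errors from close(2), such as EBADF for unused
 * slots, are ignored, so the function itself never reports failure.
 */
void
closefrom_sketch(int lowfd, int maxfd)
{
	assert(lowfd >= 0 && maxfd >= lowfd);

	for (int fd = lowfd; fd < maxfd; fd++)
		(void)close(fd);
}
```

Like the system call described above, this loop does nothing to stop another thread from opening a new descriptor while it runs.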
#
f4471727 |
|
02-Jun-2009 |
Jeff Roberson <jeff@FreeBSD.org> |
- Use an acquire barrier to increment f_count in fget_unlocked and remove the volatile cast. Describe the reason in detail in a comment. Discussed with: bde, jhb
|
#
0304c731 |
|
27-May-2009 |
Jamie Gritton <jamie@FreeBSD.org> |
Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
|
#
6ca33ea3 |
|
20-May-2009 |
John Baldwin <jhb@FreeBSD.org> |
Set the umask in a new file descriptor table earlier in fdcopy() to remove two lock operations.
|
#
6b72d8db |
|
15-May-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert r192094. The revision caused problems for sysctl(3) consumers that expect that oldlen is filled with required buffer length even when supplied buffer is too short and returned error is ENOMEM. Redo the fix for kern.proc.filedesc, by reverting the req->oldidx when remaining buffer space is too short for the current kinfo_file structure. Also, only ignore ENOMEM. We have to convert ENOMEM to no error condition to keep existing interface for the sysctl, though. Reported by: ed, Florian Smeets <flo kasimir com> Tested by: pho
|
#
bf422e5f |
|
13-May-2009 |
Jeff Roberson <jeff@FreeBSD.org> |
- Implement a lockless file descriptor lookup algorithm in fget_unlocked(). - Save old file descriptor tables created on expansion until the entire descriptor table is freed so that pointers may be followed without regard for expanders. - Mark the file zone as NOFREE so we may attempt to reference potentially freed files. - Convert several fget_locked() users to fget_unlocked(). This requires us to manage reference counts explicitly but reduces locking overhead in the common case.
|
#
3f11530b |
|
15-Apr-2009 |
John Baldwin <jhb@FreeBSD.org> |
Update the comment above _fget() to reflect the earlier change that made FWRITE failures return EBADF rather than EINVAL. Submitted by: Jaakko Heinonen jh saunalahti fi MFC after: 1 month
|
#
06186300 |
|
14-Feb-2009 |
Joe Marcus Clarke <marcus@FreeBSD.org> |
Remove the printf's when the vnode to be exported for procstat is not a VDIR. If the file system backing a process' cwd is removed, and procstat -f PID is called, then these messages would have been printed. The extra verbosity is not required in this situation. Requested by: kib Approved by: kib
|
#
03fd9c20 |
|
14-Feb-2009 |
Joe Marcus Clarke <marcus@FreeBSD.org> |
Change two KASSERTS to printfs and simple returns. Stress testing has revealed that a process' current working directory can be VBAD if the directory is removed. This can trigger a panic when procstat -f PID is run. Tested by: pho Discovered by: phobot Reviewed by: kib Approved by: kib
|
#
54fffe2d |
|
11-Feb-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Modify fdcopy() so that, during fork(2), it won't copy file descriptors from the parent to the child process if they have an operation vector of &badfileops. This narrows a set of races involving system calls that allocate a new file descriptor, potentially block for some extended period, and then return the file descriptor, when invoked by a threaded program that concurrently invokes fork(2). Similar approaches are used in both Solaris and Linux, and the wideness of this race was introduced in FreeBSD when we moved to a more optimistic implementation of accept(2) in order to simplify locking. A small race necessarily remains because the fork(2) might occur after the finit() in accept(2) but before the system call has returned, but that appears unavoidable using current APIs. However, this race is vastly narrower. The fix can be validated using the newfileops_on_fork regression test. PR: kern/130348 Reported by: Ivan Shcheklein <shcheklein at gmail dot com> Reviewed by: jhb, kib MFC after: 1 week
|
#
7efa697d |
|
29-Dec-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Clear the pointers to the file in the struct filedesc before file is closed in fdfree. Otherwise, sysctl_kern_proc_filedesc may dereference stale struct file * values. Reported and tested by: pho MFC after: 1 month
|
#
43151ee6 |
|
01-Dec-2008 |
Peter Wemm <peter@FreeBSD.org> |
Merge user/peter/kinfo branch as of r185547 into head. This changes struct kinfo_filedesc and kinfo_vmentry such that they are same on both 32 and 64 bit platforms like i386/amd64 and won't require sysctl wrapping. Two new OIDs are assigned. The old ones are available under COMPAT_FREEBSD7 - but it isn't that simple. The superseded interface was never actually released on 7.x. The other main change is to pack the data passed to userland via the sysctl. kf_structsize and kve_structsize are reduced for the copyout. If you have a process with 100,000+ sockets open, the unpacked records require a 132MB+ copyout. With packing, it is "only" ~35MB. (Still seriously unpleasant, but not quite as devastating). A similar problem exists for the vmentry structure - have lots and lots of shared libraries and small mmaps and its copyout gets expensive too. My immediate problem is valgrind. It traditionally achieves this functionality by parsing procfs output, in a packed format. Secondly, when tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled 32 bit binary which ran directly into the differing data structures in 32 vs 64 bit mode. (valgrind uses this to track file descriptor operations and this therefore affected every single 32 bit binary) I've added two utility functions to libutil to unpack the structures into a fixed record length and to make it a little more convenient to use.
|
#
2ff47c5f |
|
04-Nov-2008 |
John Baldwin <jhb@FreeBSD.org> |
Remove unnecessary locking around vn_fullpath(). The vnode lock for the vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed. In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2). For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock. MFC after: 1 month
|
#
21fc02d2 |
|
03-Nov-2008 |
John Baldwin <jhb@FreeBSD.org> |
Use shared vnode locks instead of exclusive vnode locks for the access(), chdir(), chroot(), eaccess(), fpathconf(), fstat(), fstatfs(), lseek() (when figuring out the current size of the file in the SEEK_END case), pathconf(), readlink(), and statfs() system calls. Submitted by: ups (mostly) Tested by: pho MFC after: 1 month
|
#
e11e3f18 |
|
23-Oct-2008 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Fix a number of style issues in the MALLOC / FREE commit. I've tried to be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
|
#
1ede983c |
|
23-Oct-2008 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
ac2456bf |
|
12-Oct-2008 |
Robert Watson <rwatson@FreeBSD.org> |
Downgrade XXX to a Note for fgetsock() and fputsock(). MFC after: 3 days
|
#
bc093719 |
|
20-Aug-2008 |
Ed Schouten <ed@FreeBSD.org> |
Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan
|
#
79da190c |
|
08-Aug-2008 |
Ed Schouten <ed@FreeBSD.org> |
Remove unneeded D_NEEDGIANT from /dev/fd/{0,1,2}. There is no reason the fdopen() routine needs Giant. It only sets curthread->td_dupfd, based on the device unit number of the cdev. I guess we won't get massive performance improvements here, but still, I assume we eventually want to get rid of Giant.
|
#
6bc1e9cd |
|
26-Jun-2008 |
John Baldwin <jhb@FreeBSD.org> |
Rework the lifetime management of the kernel implementation of POSIX semaphores. Specifically, semaphores are now represented as a new file descriptor type that is set to close on exec. This removes the need for all of the manual process reference counting (and fork, exec, and exit event handlers) as the normal file descriptor operations handle all of that for us nicely. It is also suggested as one possible implementation in the spec and at least one other OS (OS X) uses this approach. Some bugs that were fixed as a result include: - References to a named semaphore whose name is removed still work after the sem_unlink() operation. Prior to this patch, if a semaphore's name was removed, valid handles from sem_open() would get EINVAL errors from sem_getvalue(), sem_post(), etc. This fixes that. - Unnamed semaphores created with sem_init() were not cleaned up when a process exited or exec'd. They were only cleaned up if the process did an explicit sem_destroy(). This could result in a leak of semaphore objects that could never be cleaned up. - On the other hand, if another process guessed the id (kernel pointer to 'struct ksem') of an unnamed semaphore (created via sem_init()) and had write access to the semaphore based on UID/GID checks, then that other process could manipulate the semaphore via sem_destroy(), sem_post(), sem_wait(), etc. - As part of the permission check (UID/GID), the umask of the process creating the semaphore was not honored. Thus if your umask denied group read/write access but the explicit mode in the sem_init() call allowed it, the semaphore would be readable/writable by other users in the same group, for example. This includes access via the previous bug. - If the module refused to unload because there were active semaphores, then it might have deregistered one or more of the semaphore system calls before it noticed that there was a problem.
I'm not sure if this actually happened as the order that modules are discovered by the kernel linker depends on how the actual .ko file is linked. One can make the order deterministic by using a single module with a mod_event handler that explicitly registers syscalls (and deregisters during unload after any checks). This also fixes a race where even if the sem_module unloaded first it would have destroyed locks that the syscalls might be trying to access if they are still executing when they are unloaded. XXX: By the way, deregistering system calls doesn't do any blocking to drain any threads from the calls. - Some minor fixes to errno values on error. For example, sem_init() isn't documented to return ENFILE or EMFILE if we run out of semaphores the way that sem_open() can. Instead, it should return ENOSPC in that case. Other changes: - Kernel semaphores now use a hash table to manage the namespace of named semaphores nearly in a similar fashion to the POSIX shared memory object file descriptors. Kernel semaphores can now also have names longer than 14 chars (up to MAXPATHLEN) and can include subdirectories in their pathname. - The UID/GID permission checks for access to a named semaphore are now done via vaccess() rather than a home-rolled set of checks. - Now that kernel semaphores have an associated file object, the various MAC checks for POSIX semaphores accept both a file credential and an active credential. There is also a new posixsem_check_stat() since it is possible to fstat() a semaphore file descriptor. - A small set of regression tests (using the ksem API directly) is present in src/tools/regression/posixsem. Reported by: kris (1) Tested by: kris Reviewed by: rwatson (lightly) MFC after: 1 month
|
#
cc8945d2 |
|
28-May-2008 |
Ed Schouten <ed@FreeBSD.org> |
Remove redundant checks from fcntl()'s F_DUPFD. Right now we perform some of the checks inside the fcntl()'s F_DUPFD operation twice. We first validate the `fd' argument. When finished, we validate the `arg' argument. These checks are also performed inside do_dup(). The reason we need to do this, is because fcntl() should return different errno's when the `arg' argument is out of bounds (EINVAL instead of EBADF). To prevent the redundant locking of the PROC_LOCK and FILEDESC_SLOCK, patch do_dup() to support the error semantics required by fcntl(). Approved by: philip (mentor)
|
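The error semantics this entry describes (EBADF for a bad `fd`, EINVAL for an out-of-range `arg`) can be probed from userland. A minimal sketch; `dupfd_errno()` is a hypothetical helper, and the exact errno for a given bad input may vary between operating systems:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try fcntl(fd, F_DUPFD, minfd).  Returns 0 on success (the duplicate
 * is closed again) or the errno reported by fcntl() on failure.
 */
int
dupfd_errno(int fd, int minfd)
{
	int newfd = fcntl(fd, F_DUPFD, minfd);

	if (newfd == -1)
		return (errno);
	assert(newfd >= minfd);		/* F_DUPFD picks the lowest free fd >= minfd */
	(void)close(newfd);
	return (0);
}
```

A caller would expect EINVAL from `dupfd_errno(fd, -1)` on a valid descriptor and EBADF from `dupfd_errno(-1, 0)`.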
#
258f4727 |
|
25-May-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Replace direct atomic operation for the file refcount with the refcount interface. It also introduces the correct usage of memory barriers, as sometimes fdrop() and fhold() are used with shared locks, which don't use any release barrier.
|
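The barrier requirement mentioned in this entry can be shown with a generic reference counter. This sketch uses C11 `<stdatomic.h>` as a stand-in and is not the FreeBSD refcount(9) implementation; `struct obj`, `obj_hold()`, and `obj_drop()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_uint refs;
};

/* Take an extra reference; the caller must already hold one. */
void
obj_hold(struct obj *o)
{
	unsigned old = atomic_fetch_add_explicit(&o->refs, 1,
	    memory_order_relaxed);

	assert(old > 0);
	(void)old;
}

/*
 * Drop a reference.  The release ordering publishes this thread's
 * earlier writes to the object; the acquire fence on the final drop
 * makes every other thread's writes visible before teardown.
 * Returns true when the caller dropped the last reference.
 */
bool
obj_drop(struct obj *o)
{
	unsigned old = atomic_fetch_sub_explicit(&o->refs, 1,
	    memory_order_release);

	assert(old > 0);
	if (old == 1) {
		atomic_thread_fence(memory_order_acquire);
		return (true);
	}
	return (false);
}
```

The counter supplies its own ordering, so correctness does not depend on whichever lock the caller happens to hold.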
#
82f4d640 |
|
21-May-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement the per-open file data for the cdev. The patch does not change the cdevsw KBI. Management of the data is provided by the functions int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr); int devfs_get_cdevpriv(void **datap); void devfs_clear_cdevpriv(void); All of the functions are supposed to be called from the cdevsw method contexts. - devfs_set_cdevpriv assigns the priv as private data for the file descriptor which is used to initiate currently performed driver operation. dtr is the function that will be called when either the last reference to the file goes away, the device is destroyed or devfs_clear_cdevpriv is called. - devfs_get_cdevpriv is the obvious accessor. - devfs_clear_cdevpriv allows clearing the private data for the still open file. Implementation keeps the driver-supplied pointers in the struct cdev_privdata, that is referenced both from the struct file and struct cdev, and cannot outlive any of the referee. Man pages will be provided after the KPI stabilizes. Reviewed by: jhb Useful suggestions from: jeff, antoine Debugging help and tested by: pho MFC after: 1 month
|
#
5894445d |
|
26-Apr-2008 |
Kris Kennaway <kris@FreeBSD.org> |
* Correct a mis-merge that leaked the PROC_LOCK [1] * Return ENOENT on error instead of 0 [2] Submitted by: rdivacky [1], kib [2]
|
#
b1ba81d9 |
|
24-Apr-2008 |
Kris Kennaway <kris@FreeBSD.org> |
fdhold can return NULL, so add the one remaining missing check for this condition. Reviewed by: attilio MFC after: 1 week
|
#
dfdcada3 |
|
26-Mar-2008 |
Doug Rabson <dfr@FreeBSD.org> |
Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. 
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks
|
#
073d8ba4 |
|
19-Mar-2008 |
Maxim Sobolev <sobomax@FreeBSD.org> |
Revert previous change - it appears that the limit I was hitting was a maxsockets limit, not a maxfiles limit. The question remains why those limits are handled differently (with an error code for maxfiles but with a sleep for maxsockets), but that will be addressed in a separate commit if necessary. Requested by: rwatson, jeff
|
#
237fdd78 |
|
16-Mar-2008 |
Robert Watson <rwatson@FreeBSD.org> |
In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink
|
#
c9370ff4 |
|
16-Mar-2008 |
Maxim Sobolev <sobomax@FreeBSD.org> |
Properly set size of the file_zone to match kern.maxfiles parameter. Otherwise the parameter is no-op, since zone by default limits number of descriptors to some 12K entries. Attempt to allocate more ends up sleeping on zonelimit. MFC after: 2 weeks
|
#
e3ad7f66 |
|
08-Mar-2008 |
Antoine Brodin <antoine@FreeBSD.org> |
Introduce a new F_DUP2FD command to fcntl(2), for compatibility with Solaris and AIX. fcntl(fd, F_DUP2FD, arg) and dup2(fd, arg) are functionally equivalent. Document it. Add some regression tests (identical to the dup2(2) regression tests). PR: 120233 Submitted by: Jukka Ukkonen Approved by: rwatson (mentor) MFC after: 1 month
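The equivalence stated above can be sketched in userland as follows. This is only an illustration: the wrapper name `dup_onto` is hypothetical, and the `#ifdef` fallback to dup2() is an assumption made so the sketch also builds on systems that do not define F_DUP2FD.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: duplicate fd onto target. Uses fcntl(F_DUP2FD)
 * where the platform defines it and falls back to dup2() elsewhere.
 * Returns the new descriptor, or -1 with errno set.
 */
static int
dup_onto(int fd, int target)
{
	assert(target >= 0);
#ifdef F_DUP2FD
	return (fcntl(fd, F_DUP2FD, target));
#else
	return (dup2(fd, target));
#endif
}
```

Either branch should return `target` on success, which is what makes the two calls interchangeable for callers.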
|
#
60e15db9 |
|
22-Feb-2008 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
This patch adds a new ktrace(2) record type, KTR_STRUCT, whose payload consists of the null-terminated name and the contents of any structure you wish to record. A new ktrstruct() function constructs and emits a KTR_STRUCT record. It is accompanied by convenience macros for struct stat and struct sockaddr. In kdump(1), KTR_STRUCT records are handled by a dispatcher function that runs stringent sanity checks on its contents before handing it over to individual decoding functions for each type of structure. Currently supported structures are struct stat and struct sockaddr for the AF_INET, AF_INET6 and AF_UNIX families; support for AF_APPLETALK and AF_IPX is present but disabled, as I am unable to test it properly. Since 's' was already taken, the letter 't' is used by ktrace(1) to enable KTR_STRUCT trace points, and in kdump(1) to enable their decoding. Derived from patches by Andrew Li <andrew2.li@citi.com>. PR: kern/117836 MFC after: 3 weeks
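The payload layout described above (a null-terminated name followed by the raw structure bytes) can be sketched in userland like this. The helper names `pack_struct_record` and `unpack_struct_record` are hypothetical and are not the kernel's ktrstruct() or kdump's dispatcher; the sanity check shown is a minimal stand-in for the stricter checks the message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: pack "name\0" followed by the raw bytes of a
 * structure into buf. Returns the payload length, or 0 if it does not fit.
 */
static size_t
pack_struct_record(char *buf, size_t buflen, const char *name,
    const void *data, size_t datalen)
{
	size_t namelen;

	assert(buf != NULL && name != NULL && data != NULL);
	namelen = strlen(name) + 1;	/* include the terminating NUL */
	if (datalen > buflen || namelen > buflen - datalen)
		return (0);
	memcpy(buf, name, namelen);
	memcpy(buf + namelen, data, datalen);
	return (namelen + datalen);
}

/*
 * Hypothetical decoder-side check: the name must be NUL-terminated inside
 * the payload and match the expected one. On success, returns a pointer to
 * the structure bytes and stores their length in *datalen.
 */
static const void *
unpack_struct_record(const char *buf, size_t len, const char *name,
    size_t *datalen)
{
	const char *nul = memchr(buf, '\0', len);

	if (nul == NULL || strcmp(buf, name) != 0)
		return (NULL);
	*datalen = len - (size_t)(nul - buf + 1);
	return (nul + 1);
}
```

A real decoder would additionally compare the remaining length against the size of the structure it expects before interpreting the bytes.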
|
#
1b708999 |
|
14-Feb-2008 |
Simon L. B. Nielsen <simon@FreeBSD.org> |
Fix sendfile(2) write-only file permission bypass. Security: FreeBSD-SA-08:03.sendfile Submitted by: kib
|
#
f2805949 |
|
08-Feb-2008 |
Joe Marcus Clarke <marcus@FreeBSD.org> |
Add support for displaying a process' current working directory, root directory, and jail directory within procstat. While this functionality is available already in fstat, encapsulating it in the kern.proc.filedesc sysctl makes it accessible without using kvm and thus without needing elevated permissions. The new procstat output looks like:
  PID   COMM FD   T V FLAGS    REF OFFSET PRO NAME
  76792 tcsh cwd  v d -------- -   -      -   /usr/src
  76792 tcsh root v d -------- -   -      -   /
  76792 tcsh 15   v c rw------ 16  9130   -   -
  76792 tcsh 16   v c rw------ 16  9130   -   -
  76792 tcsh 17   v c rw------ 16  9130   -   -
  76792 tcsh 18   v c rw------ 16  9130   -   -
  76792 tcsh 19   v c rw------ 16  9130   -   -
I am also bumping __FreeBSD_version for this as this new feature will be used in at least one port. Reviewed by: rwatson Approved by: rwatson
|
#
07dd4a31 |
|
20-Jan-2008 |
Robert Watson <rwatson@FreeBSD.org> |
Export a type for POSIX SHM file descriptors via kern.proc.filedesc as used by procstat, or SHM descriptors will show up as type unknown in userspace.
|
#
22db15c0 |
|
13-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed separately. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
cb05b60a |
|
09-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and in particular removes an annoying dependency, helping the next lockmgr() cleanup. The KPI, obviously, changes as a result. Manpage and FreeBSD_version will be updated through further commits. As a side note, it is worth saying that upcoming commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
8e38aeff |
|
08-Jan-2008 |
John Baldwin <jhb@FreeBSD.org> |
Add a new file descriptor type for IPC shared memory objects and use it to implement shm_open(2) and shm_unlink(2) in the kernel: - Each shared memory file descriptor is associated with a swap-backed vm object which provides the backing store. Each descriptor starts off with a size of zero, but the size can be altered via ftruncate(2). The shared memory file descriptors also support fstat(2). read(2), write(2), ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared memory file descriptors. - shm_open(2) and shm_unlink(2) are now implemented as system calls that manage shared memory file descriptors. The virtual namespace that maps pathnames to shared memory file descriptors is implemented as a hash table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash of the pathname. - As an extension, the constant 'SHM_ANON' may be specified in place of the path argument to shm_open(2). In this case, an unnamed shared memory file descriptor will be created similar to the IPC_PRIVATE key for shmget(2). Note that the shared memory object can still be shared among processes by sharing the file descriptor via fork(2) or sendmsg(2), but it is unnamed. This effectively serves to implement the getmemfd() idea bandied about the lists several times over the years. - The backing store for shared memory file descriptors are garbage collected when they are not referenced by any open file descriptors or the shm_open(2) virtual namespace. Submitted by: dillon, peter (previous versions) Submitted by: rwatson (I based this on his version) Reviewed by: alc (suggested converting getmemfd() to shm_open())
|
#
e4650294 |
|
07-Jan-2008 |
John Baldwin <jhb@FreeBSD.org> |
Make ftruncate a 'struct file' operation rather than a vnode operation. This makes it possible to support ftruncate() on non-vnode file types in the future. - 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on a given file descriptor. - ftruncate() moves to kern/sys_generic.c and now just fetches a file object and invokes fo_truncate(). - The vnode-specific portions of ftruncate() move to vn_truncate() in vfs_vnops.c which implements fo_truncate() for vnode file types. - Non-vnode file types return EINVAL in their fo_truncate() method. Submitted by: rwatson
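The dispatch pattern described above can be sketched as follows. All type and function names here (`xfile`, `xfileops`, `x_ftruncate`, and so on) are hypothetical, reduced stand-ins rather than the kernel's definitions; only the shape (a per-type truncate method, with EINVAL from types that cannot be truncated) follows the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct xfile;	/* hypothetical stand-in for struct file */

/* Hypothetical, reduced ops vector with only the truncate method. */
struct xfileops {
	int	(*fo_truncate)(struct xfile *fp, long length);
};

struct xfile {
	const struct xfileops	*f_ops;
	long			 f_size;	/* toy backing-object size */
};

/* Vnode-like types implement truncation. */
static int
vn_like_truncate(struct xfile *fp, long length)
{
	if (length < 0)
		return (EINVAL);
	fp->f_size = length;
	return (0);
}

/* Types that cannot be truncated simply report EINVAL. */
static int
invalid_truncate(struct xfile *fp, long length)
{
	(void)fp;
	(void)length;
	return (EINVAL);
}

static const struct xfileops vnode_like_ops = {
	.fo_truncate = vn_like_truncate
};
static const struct xfileops socket_like_ops = {
	.fo_truncate = invalid_truncate
};

/* The generic entry point only dispatches through the ops vector. */
static int
x_ftruncate(struct xfile *fp, long length)
{
	assert(fp != NULL && fp->f_ops != NULL);
	return (fp->f_ops->fo_truncate(fp, length));
}
```

With this shape, supporting truncation on a new file type only requires supplying a different method in that type's ops vector; the generic entry point stays unchanged.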
|
#
a57decdf |
|
02-Jan-2008 |
Jeff Roberson <jeff@FreeBSD.org> |
- In sysctl_kern_file skip fdps with negative lastfiles. This can happen if there are no files open. Accounting for these can eventually return a negative value for olenp causing sysctl to crash with a bad malloc. Reported by: Pawel Worach <pawel.worach@gmail.com>
|
#
397c19d1 |
|
29-Dec-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove explicit locking of struct file. - Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho
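The initialization ordering described above can be illustrated with C11 atomics. This is a sketch under stated assumptions: the `yfile` types and `y_` functions are hypothetical, the use of NULL as the "not yet initialized" ops value is a simplification, and the release/acquire pairing is one way to express "ops valid only after data, type, and flags", not necessarily what the kernel code does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct yfileops { int dummy; };	/* hypothetical placeholder ops vector */

struct yfile {
	void				*f_data;
	short				 f_type;
	atomic_uint			 f_flag;
	atomic_uint			 f_count;
	_Atomic(const struct yfileops *) f_ops;	/* published last */
};

/* Fill in data, type and flags first, then publish the ops vector. */
static void
y_finit(struct yfile *fp, unsigned flag, short type, void *data,
    const struct yfileops *ops)
{
	assert(fp != NULL && ops != NULL);
	fp->f_data = data;
	fp->f_type = type;
	atomic_store_explicit(&fp->f_flag, flag, memory_order_relaxed);
	/* Release store: the writes above become visible before f_ops. */
	atomic_store_explicit(&fp->f_ops, ops, memory_order_release);
}

/* Reader side: returns NULL until the file has been fully initialized. */
static const struct yfileops *
y_file_ops(struct yfile *fp)
{
	return (atomic_load_explicit(&fp->f_ops, memory_order_acquire));
}

/* Take a reference without any per-file lock; returns the new count. */
static unsigned
y_fhold(struct yfile *fp)
{
	return (atomic_fetch_add(&fp->f_count, 1) + 1);
}
```

A reader that observes a non-NULL ops pointer through `y_file_ops()` can then rely on `f_data` and `f_type` being set.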
|
#
cc43c38c |
|
02-Dec-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Add two new sysctls in support of the forthcoming procstat(1) to support its -f and -v arguments: kern.proc.filedesc - dump file descriptor information for a process, if debugging is permitted, including socket addresses, open flags, file offsets, file paths, etc. kern.proc.vmmap - dump virtual memory mapping information for a process, if debugging is permitted, including layout and information on underlying objects, such as the type of object and path. These provide a superset of the information historically available through the now-deprecated procfs(4), and are intended to be exported in an ABI-robust form.
|
#
0bf686c1 |
|
06-Aug-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminating quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)
|
#
f6c1ecca |
|
03-Jul-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- Use explicit locking in the various fcntl case statements so that we can acquire shared filedescriptor locks in the appropriate cases. - Remove Giant from calls that issue ioctls. The ioctl path has been mpsafe for some time now. - Only acquire giant for VOP_ADVLOCK when the filesystem requires giant. advlock is now mpsafe. Reviewed by: rwatson Approved by: re
|
#
7251b786 |
|
16-Jun-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Rather than passing SUSER_RUID into priv_check_cred() to specify when a privilege is checked against the real uid rather than the effective uid, instead decide which uid to use in priv_check_cred() based on the privilege passed in. We use the real uid for PRIV_MAXFILES, PRIV_MAXPROC, and PRIV_PROC_LIMIT. Remove the definition of SUSER_RUID; there are now no flags defined for priv_check_cred(). Obtained from: TrustedBSD Project
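The idea above, choosing the uid from the privilege rather than from a caller-supplied flag, can be sketched as follows. The enum values, struct layout, and the "only uid 0 is privileged" policy are hypothetical simplifications for illustration; the real privilege numbers and policy live in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical privilege identifiers, not the kernel's numeric values. */
enum xpriv { XPRIV_MAXFILES, XPRIV_MAXPROC, XPRIV_PROC_LIMIT, XPRIV_OTHER };

struct xcred {
	unsigned	cr_uid;		/* effective uid */
	unsigned	cr_ruid;	/* real uid */
};

/* Pick the uid to test based on the privilege, not on a caller flag. */
static unsigned
xpriv_uid(const struct xcred *cred, enum xpriv priv)
{
	assert(cred != NULL);
	switch (priv) {
	case XPRIV_MAXFILES:
	case XPRIV_MAXPROC:
	case XPRIV_PROC_LIMIT:
		return (cred->cr_ruid);
	default:
		return (cred->cr_uid);
	}
}

/* Toy policy: only uid 0 holds any privilege. */
static bool
xpriv_granted(const struct xcred *cred, enum xpriv priv)
{
	return (xpriv_uid(cred, priv) == 0);
}
```

Under this sketch, a set-uid-root process started by an ordinary user is still subject to the resource limits, because those three privileges look at the real uid.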
|
#
9e223287 |
|
31-May-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)
|
#
5c76452f |
|
04-May-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Mark the filedescriptor table entries with VOP_OPEN being performed for them as UF_OPENING. Disable closing of those entries. This should fix the crashes caused by devfs_open() (and fifo_open()) dereferencing struct file * by index, while the filedescriptor is closed by a parallel thread. Idea by: tegge Reviewed by: tegge (previous version of patch) Tested by: Peter Holm Approved by: re (kensmith) MFC after: 3 weeks
|
#
06e043fb |
|
26-Apr-2007 |
John Baldwin <jhb@FreeBSD.org> |
Avoid a lot of code duplication by using kern_open() to open /dev/null in fdcheckstd() instead of a stripped down version of kern_open()'s code. MFC after: 1 week Reviewed by: cperciva
|
#
5e3f7694 |
|
04-Apr-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff
|
#
3076ca67 |
|
15-Mar-2007 |
John Baldwin <jhb@FreeBSD.org> |
Just use 'fdrop()' instead of 'FILE_LOCK(); fdrop_locked()' in dupfdopen(). While I'm at it, move the second fdrop() out from under the filedesc lock.
|
#
873fbcd7 |
|
05-Mar-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Further system call comment cleanup: - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde) - Remove extra blank lines in some cases. - Add extra blank lines in some cases. - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name. - Add punctuation. - Re-wrap some comments.
|
#
0c14ff0e |
|
04-Mar-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Remove 'MPSAFE' annotations from the comments above most system calls: all system calls now enter without Giant held, and then in some cases, acquire Giant explicitly. Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.
|
#
780a98ad |
|
15-Feb-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Catch up file descriptor printing function in DDB to the addition of kqueues and POSIX message queues.
|
#
442f65e9 |
|
15-Feb-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Break file descriptor printing logic out of db_show_files() into db_print_file(), and add a new "show file <ptr>" DDB command, which can be used to print out file descriptors referenced in stack traces.
|
#
4f506694 |
|
17-Jan-2007 |
Xin LI <delphij@FreeBSD.org> |
Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.
|
#
9ae328fc |
|
05-Jan-2007 |
John Baldwin <jhb@FreeBSD.org> |
- Close a race between enumerating UNIX domain socket pcb structures via sysctl and socket teardown by adding a reference count to the UNIX domain pcb object and fixing the sysctl that enumerates unpcbs to grab a reference on each unpcb while it builds the list to copy out to userland. - Close a race between UNIX domain pcb garbage collection (unp_gc()) and file descriptor teardown (fdrop()) by adding a new garbage collection flag FWAIT. unp_gc() sets FWAIT while it walks the message buffers in a UNIX domain socket looking for nested file descriptor references and clears the flag when it is finished. fdrop() checks to see if the flag is set on a file descriptor whose refcount just dropped to 0 and waits for unp_gc() to clear the flag before completely destroying the file descriptor. MFC after: 1 week Reviewed by: rwatson Submitted by: ups Hopefully makes the panics go away: mx1
|
#
acd3428b |
|
06-Nov-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
#
aeab19b2 |
|
23-Sep-2006 |
John-Mark Gurney <jmg@FreeBSD.org> |
Return EBADF instead of successfully attaching (and then panicking) when an fd is dying. Convinced by: jhb PR: 103127
|
#
b04aff77 |
|
21-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
Add a comment to explain what fdclose() does and what its purpose is since the subtlety eluded me when I looked at it last week.
|
#
c1cccebe |
|
08-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
Add a kern_close() so that the ABIs can close a file descriptor w/o having to populate a close_args struct and change some of the places that do.
|
#
0bd645ae |
|
27-Jun-2006 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Compress direct cr_ruid comparison and jailed() call to suser_cred(9). Reviewed by: rwatson
|
#
197b35d7 |
|
01-Apr-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Mark fgetsock() and fputsock() as deprecated: callers should rely on the file descriptor reference, rather than paying additional lock operations to acquire a socket reference from the file descriptor. This will also help to ensure that file descriptor based socket requests are not delivered to a socket after close. Most consumers have already been converted to this model. MFC after: 3 months
|
#
2ed4894a |
|
19-Mar-2006 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Restore fd optimization with a few minor tweaks, to quote tegge: "fdinit() fails to initialize newfdp->fd_fd.fd_lastfile to -1. This breaks fdcopy() which will incorrectly set newfdp->fd_freefile to 1 if no files are open and the last file descriptor marked as unused for fdp was 0. This later causes descriptor 0 to be unavailable in newfdp when the optimization is enabled. When the last file descriptor previously marked as used is nonzero and marked as unused, fdunused() incorrectly sets fdp->fd_lastfile to fd - 1 due to fd_last_used() returning (size - 1). This hides the problem that breaks the optimization." This allows us to keep the optimization, while un-breaking it. This is a RELENG_6 candidate. PR: kern/87208 MFC after: 1 week Submitted by: tegge
|
#
30bacc08 |
|
18-Mar-2006 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Back out fd optimization introduced in revision 1.280 as it appears to be really breaking things. Simple "close(0); dup(fd)" does not return descriptor "0" in some cases. Further, this change also breaks some MAC interactions with mac_execve_will_transition(). Under certain circumstances, fdcheckstd() can be called in execve(2) causing an assertion that checks to make sure that stdin, stdout and stderr reside at indexes 0, 1 and 2 in the process fd table to fail, resulting in a kernel panic when INVARIANTS is on. This should also kill the "dup(2) regression on 6.x" show stopper item on the 6.1-RELEASE TODO list. This is a RELENG_6 candidate. PR: kern/87208 Silence from: des MFC after: 1 week
|
#
a750d0b2 |
|
05-Feb-2006 |
Wayne Salamon <wsalamon@FreeBSD.org> |
Add auditing of arguments to the close() and fstat() system calls. Much more argument auditing yet to come, for remaining system calls in this file. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)
|
#
38f63f7e |
|
06-Jan-2006 |
John Baldwin <jhb@FreeBSD.org> |
Return EBADF rather than EINVAL for FWRITE failure as per POSIX. MFC after: 1 week
|
#
b2f92ef9 |
|
29-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Last step to make mq_notify conform to the POSIX standard. If the process has successfully attached a notification request to the message queue via a queue descriptor, file closing should remove the attachment.
|
#
742be782 |
|
10-Nov-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Add the f_msgcount field to the set of struct file fields printed in show files. MFC after: 1 week
|
#
2be165c9 |
|
10-Nov-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Expand the set of details printed for each file descriptor to include its garbage collection flags. Reformat generally to make this fit and leave some room for future expansion. MFC after: 1 week
|
#
b4e507aa |
|
10-Nov-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Add a DDB "show files" command to list the current open file list, some state about each open file, and identify the first process in the process table that references the file. This is helpful in debugging leaks of file descriptors. MFC after: 1 week
|
#
f8a9ed1f |
|
09-Nov-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Fix typo in recent comment tweak. Submitted by: jkim MFC after: 1 week
|
#
923633b4 |
|
09-Nov-2005 |
Robert Watson <rwatson@FreeBSD.org> |
In closef(), remove the assumption that there is a thread associated with the file descriptor. When a file descriptor is closed as a result of garbage collecting a UNIX domain socket, the file descriptor will not have any associated thread, so the logic to identify advisory locks held by that thread is not appropriate. Check the thread for NULL to avoid this scenario. Expand an existing comment to say a bit more about this. MFC after: 1 week
|
#
68a17869 |
|
01-Nov-2005 |
John Baldwin <jhb@FreeBSD.org> |
Push down Giant into fdfree() and remove it from two of the callers. Other callers such as some rfork() cases weren't locking Giant anyway. Reviewed by: csjp MFC after: 1 week
|
#
5bb84bc8 |
|
31-Oct-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
|
#
826cf005 |
|
04-Oct-2005 |
Roman Kurakin <rik@FreeBSD.org> |
Use FILEDESC_UNLOCK(fdp) after FILE_UNLOCK(p), not before to avoid LOR. Slightly discussed on current@. LOR #055 MFC after: 14 days
|
#
d09dfa2b |
|
26-Aug-2005 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Two minor optimizations of fdalloc(): - if minfd < fd_freefile (as is most often the case, since minfd is usually 0), set it to fd_freefile. - remove a call to fd_first_free() which duplicates work already done by fdused(). This change results in a small but measurable speedup for processes with large numbers (several thousands) of open files. PR: kern/85176 Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> MFC after: 3 weeks
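The first optimization above relies on a hint that records the lowest descriptor that might be free. A toy version of that search is sketched here. The table layout, the fixed size, and the `x_` names are hypothetical; the real code uses a bitmap and grows the table, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	XNFILES	64

struct xfdtable {
	bool	used[XNFILES];
	int	freefile;	/* no descriptor below this index is free */
};

/* Return the lowest free descriptor >= minfd, or -1 if none is left. */
static int
x_fdalloc(struct xfdtable *t, int minfd)
{
	int fd;

	assert(t != NULL && minfd >= 0);
	if (minfd < t->freefile)
		minfd = t->freefile;	/* skip the known-used prefix */
	for (fd = minfd; fd < XNFILES; fd++) {
		if (!t->used[fd]) {
			t->used[fd] = true;
			/* Advance the hint only if we consumed it. */
			if (fd == t->freefile)
				t->freefile = fd + 1;
			return (fd);
		}
	}
	return (-1);
}

/* Release a descriptor and pull the hint back if needed. */
static void
x_fdunused(struct xfdtable *t, int fd)
{
	assert(t != NULL && fd >= 0 && fd < XNFILES && t->used[fd]);
	t->used[fd] = false;
	if (fd < t->freefile)
		t->freefile = fd;
}
```

The saving comes from never rescanning the prefix of the table that is already known to be fully in use, which matters most for processes holding thousands of descriptors.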
|
#
1ee6b746 |
|
24-Jun-2005 |
Dima Dorfman <dd@FreeBSD.org> |
Fix fdcheckstd to pass the file descriptor along through vn_open. When opening a device, devfs_open needs the file descriptor to install its own fileops. Failing to pass the file descriptor causes the vnode to be returned with the regular vnops, which will cause a panic on the first read or write because devfs_specops is not meant to support those operations. This bug caused a panic after exec'ing any set[ug]id program with fds 0..2 closed (i.e., if any action had to be taken by fdcheckstd, we would panic if the exec'd program ever tried to use any of those descriptors). Reviewed by: phk Approved by: re (scottl)
|
#
6de925e5 |
|
03-May-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Use NAMEI to pick up Giant if we need it in fdcheckstd().
|
#
0a11e999 |
|
08-Mar-2005 |
Giorgos Keramidas <keramida@FreeBSD.org> |
Remove redundant initialization that is repeated in the for() loop right below it. Approved by: jhb
|
#
46da8bf8 |
|
07-Mar-2005 |
Giorgos Keramidas <keramida@FreeBSD.org> |
Typo & grammar fixes in comments.
|
#
44dc16a9 |
|
09-Feb-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make some file/filedesc related functions static
|
#
76951d21 |
|
07-Feb-2005 |
John Baldwin <jhb@FreeBSD.org> |
- Tweak kern_msgctl() to return a copy of the requested message queue id structure in the struct pointed to by the 3rd argument for IPC_STAT and get rid of the 4th argument. The old way returned a pointer into the kernel array that the calling function would then access afterwards without holding the appropriate locks and doing non-lock-safe things like copyout() with the data anyways. This change removes that unsafeness and resulting race conditions as well as simplifying the interface. - Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(), fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of code duplication for freebsd4 and netbsd compatibility system calls. - Add a new lookup function kern_alternate_path() that looks up a filename under an alternate prefix and determines which filename should be used. This is basically a more general version of linux_emul_convpath() that can be shared by all the ABIs thus allowing for further reduction of code duplication.
|
#
8516dd18 |
|
24-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't use VOP_GETVOBJECT, use vp->v_object directly.
|
#
66ca1b48 |
|
24-Jan-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Use VFS_LOCK_GIANT() in place of mtx_lock(&giant), etc. Sponsored By: Isilon Systems, Inc.
|
#
9454b2d8 |
|
06-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
#
662d80dc |
|
14-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a deadlock I introduced this morning. Mostly from: tegge
|
#
d986dbb4 |
|
14-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a new kind of reference count (fd_holdcnt) to struct filedesc which holds on to just the data structure and the mutex. (The existing refcount (fd_refcnt) holds onto the open files in the descriptor.) The fd_holdcnt is protected by fdesc_mtx, fd_refcnt by FILEDESC_LOCK. Add fdhold(struct proc *) which gets a hold on the filedescriptors of the specified proc.. Add fddrop(struct filedesc *) which drops the fd_holdcnt and if zero destroys the mutex and frees the memory. Initialize the fd_holdcnt to one in fdinit(). Normal operations on the filedesc structure will not change it. In fdfree() use fddrop() to dispose of the mutex and structure. Hold the FILEDESC_LOCK() until we have cleaned out the contents and carefully set the fields to null values during cleanup. Use fdhold()/fddrop() in mountcheckdirs() and sysctl_kern_file().
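The two reference counts described above can be sketched as follows. This is a single-threaded illustration with hypothetical `z_` names and no locking, whereas the message says the real counters are protected by fdesc_mtx and FILEDESC_LOCK respectively. It only shows why a holder can keep the structure alive after the open files have been released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct zfiledesc {
	int	fd_refcnt;	/* users of the open files */
	int	fd_holdcnt;	/* users of the structure itself */
	int	fd_nopen;	/* toy stand-in for the open-file table */
};

/* Both counts start at one, as the message describes for fdinit(). */
static struct zfiledesc *
z_fdinit(int nopen)
{
	struct zfiledesc *fdp = calloc(1, sizeof(*fdp));

	assert(fdp != NULL);
	fdp->fd_refcnt = 1;
	fdp->fd_holdcnt = 1;
	fdp->fd_nopen = nopen;
	return (fdp);
}

/* Keep the structure (not the files) alive. */
static void
z_fdhold(struct zfiledesc *fdp)
{
	fdp->fd_holdcnt++;
}

/* Drop a hold; returns true if the structure was freed. */
static bool
z_fddrop(struct zfiledesc *fdp)
{
	assert(fdp->fd_holdcnt > 0);
	if (--fdp->fd_holdcnt > 0)
		return (false);
	free(fdp);
	return (true);
}

/* The last file reference cleans out the contents, then drops the initial hold. */
static bool
z_fdfree(struct zfiledesc *fdp)
{
	assert(fdp->fd_refcnt > 0);
	if (--fdp->fd_refcnt > 0)
		return (false);
	fdp->fd_nopen = 0;	/* set fields to null values during cleanup */
	return (z_fddrop(fdp));
}
```

A walker such as a sysctl handler would call the hold function before inspecting another process's table and the drop function afterwards, so the memory cannot disappear underneath it even if the owning process exits in between.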
|
#
30abaa53 |
|
14-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make fdesc_mtx private to kern_descrip.c now that the flock has come home.
|
#
12b18fda |
|
14-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move the checkdirs() function from vfs_mount.c to kern_descrip.c and call it mountcheckdirs().
|
#
c113083c |
|
14-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add new function fdunshare() which encapsulates the necessary light magic for ensuring that a process' filedesc is not shared with anybody. Use it in the two places which previously had private implementations. This collects all fd_refcnt handling in kern_descrip.c
|
#
9722743b |
|
03-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Sort and wash #includes.
|
#
355be4ee |
|
01-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Drop ffree() as a separate function and incorporate the only place used.
|
#
20ddb405 |
|
02-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Style polishing. Use grepable functions Other minor nitpickings.
|
#
d672e075 |
|
01-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
We already have a lock initialization function, use that for fdesc_mtx also. Polish badfo stuff.
|
#
010b1e3f |
|
01-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Collect the stuff for the /dev/fd/{%d,std{in,out,err}} pseudo-device driver at the bottom of the file.
|
#
e4643c73 |
|
01-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
"nfiles" is a bad name for a global variable. Call it "openfiles" instead as this is more correct and matches the sysctl variable.
|
#
cc2f51ef |
|
01-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Style: move data to top of file.
|
#
1a1238a1 |
|
28-Nov-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Don't acquire Giant before calling closef() in close() (and elsewhere); instead acquire it conditionally in closef() if it is required for advisory locking. This removes Giant from the close() path of sockets and pipes (and any other objects that don't acquire Giant in their fo_close path, such as kqueues). Giant will still be acquired twice for vnodes -- once for advisory lock teardown, and a second time in the fo_close method. Both Poul-Henning and I believe that the advisory lock teardown code can be moved into the vn_closefile path shortly. This trims a percent or two off the cost of most non-vnode close operations on SMP, but has a fairly minimal impact on UP where the cost of a single mutex operation is pretty low.
|
#
f0775d7c |
|
25-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix LOR. Solution pointed out by: jhb
|
#
c17ff949 |
|
21-Nov-2004 |
David Schultz <das@FreeBSD.org> |
Neither of the arguments to closef() can be NULL anymore, so don't check for that.
|
#
dc990525 |
|
16-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move a FILEDESC_UNLOCK up to maintain correct nesting of FILEDESC/FILE locking.
|
#
970d8904 |
|
15-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make FILE_LOCK and FILEDESC_LOCK nest properly by postponing the release of FILEDESC_LOCK a few more lines.
|
#
2e4fed7c |
|
14-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move #define up.
|
#
124e4c3b |
|
13-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST. Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.
|
#
598b7ec8 |
|
07-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use more intuitive pointer for fdinit() and fdcopy(). Change fdcopy() to take unlocked filedesc.
|
#
ef11fbd7 |
|
07-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Introduce fdclose() which will clean an entry in a filedesc. Replace homerolled versions with call to fdclose(). Make fdunused() static to kern_descrip.c
|
#
2f5a40aa |
|
07-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move fdinit() related stuff from .h to .c
|
#
8ec21e3a |
|
06-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Allow fdinit() to be called with a NULL fdp argument so we can use it when setting up init. Make fdinit() lock the fdp argument as needed.
|
#
3b19b5af |
|
06-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
When we open /dev/null for stdin/out/err for safety reasons, do it right: we should preserve f_data and f_ops if they are already set.
|
#
81158452 |
|
18-Oct-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Push acquisition of the accept mutex out of sofree() into the caller (sorele()/sotryfree()): - This permits the caller to acquire the accept mutex before the socket mutex, avoiding sofree() having to drop the socket mutex and re-order, which could lead to races permitting more than one thread to enter sofree() after a socket is ready to be free'd. - This also covers clearing of the so_pcb weak socket reference from the protocol to the socket, preventing races in clearing and evaluation of the reference such that sofree() might be called more than once on the same socket. This appears to close a race I was able to easily trigger by repeatedly opening and resetting TCP connections to a host, in which the tcp_close() code called as a result of the RST raced with the close() of the accepted socket in the user process resulting in simultaneous attempts to de-allocate the same socket. The new locking increases the overhead for operations that may potentially free the socket, so we will want to revise the synchronization strategy here as we normalize the reference counting model for sockets. The use of the accept mutex in freeing of sockets that are not listen sockets is primarily motivated by the potential need to remove the socket from the incomplete connection queue on its parent (listen) socket, so cleaning up the reference model here may allow us to substantially weaken the synchronization requirements. RELENG_5_3 candidate. MFC after: 3 days Reviewed by: dwhite Discussed with: gnn, dwhite, green Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de> Reported by: Vlad <marchenko at gmail dot com>
|
#
c233d032 |
|
04-Oct-2004 |
Julian Elischer <julian@FreeBSD.org> |
Another case where we need to guard against a partially constructed process. Submitted by: Stephan Uphoff ( ups at tree.com ) MFC after: 3 days
|
#
16239786 |
|
19-Aug-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Remove GIANT_REQUIRED from setugidsafety() as knote_fdclose() no longer requires Giant.
|
#
8912c44d |
|
15-Aug-2004 |
Brian Feldman <green@FreeBSD.org> |
Add the missing knote_fdclose().
|
#
ad3b9257 |
|
15-Aug-2004 |
John-Mark Gurney <jmg@FreeBSD.org> |
Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)
|
#
b223d064 |
|
07-Aug-2004 |
Robert Watson <rwatson@FreeBSD.org> |
We're not yet ready to assert !Giant in kern_fcntl(), as it's called with Giant from ABI wrappers such as Linux emulation. Foot shoot off: phk
|
#
a0a81974 |
|
06-Aug-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Avoid acquiring Giant for some common light-weight or already MPSAFE fcntl() operations, including:
  F_DUPFD   dup() alias
  F_GETFD   retrieve close-on-exec flag
  F_SETFD   set close-on-exec flag
  F_GETFL   retrieve file descriptor flags
For the remaining fcntl() operations, do acquire Giant, especially where we call into fo_ioctl() as a result. We're not yet ready to push Giant into fo_ioctl(). Once we do, this can all become quite a bit prettier.
|
#
0be8ad5f |
|
04-Aug-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Assert Giant in the following file descriptor-related functions:
  Function          Reason
  --------          ------
  fdfree()          VFS
  setugidsafety()   KQueue
  fdcheckstd()      VFS
  _fgetvp()         VFS
  fgetsock()        Conditional assertion based on debug.mpsafenet
|
#
a6719c82 |
|
22-Jul-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Push Giant acquisition down into fo_stat() from most callers. Acquire Giant conditional on debug.mpsafenet in the socket soo_stat() routine, unconditionally in vn_statfile() for VFS, and otherwise don't acquire Giant. Accept an unlocked read in kqueue_stat(), and cryptof_stat() is a no-op. Don't acquire Giant in fstat() system call. Note: in fdescfs, fo_stat() is called while holding Giant due to the VFS stack sitting on top, and therefore there will still be Giant recursion in this case.
|
#
1c1ce925 |
|
22-Jul-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Push acquisition of Giant from fdrop_closed() into fo_close() so that individual file object implementations can optionally acquire Giant if they require it:
- soo_close(): depends on debug.mpsafenet
- pipe_close(): Giant not acquired
- kqueue_close(): Giant required
- vn_close(): Giant required
- cryptof_close(): Giant required (conservative)
Notes: Giant is still acquired in close() even when closing MPSAFE objects due to kqueue requiring Giant in the calling closef() code. Microbenchmarks indicate that this removal of Giant cuts 3%-3% off of pipe create/destroy pairs from user space with SMP compiled into the kernel. The cryptodev and opencrypto code appears MPSAFE, but I'm unable to test it extensively and so have left Giant over fo_close(). It can probably be removed given some testing and review.
|
#
ed6c545c |
|
14-Jul-2004 |
Christian S.J. Peron <csjp@FreeBSD.org> |
In addition to the real user ID check, do an explicit jail check to ensure that the caller is not prison root. The intention is to fix file descriptor creation so that prison root can not use the last remaining file descriptors. This privilege should be reserved for non-jailed root users. Approved by: bmilekic (mentor)
|
#
a769355f |
|
19-Jun-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Explicitly initialize f_data and f_vnode to NULL. Report f_vnode to userland in struct xfile.
|
#
89c9c53d |
|
16-Jun-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.
|
#
395a08c9 |
|
12-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Extend coverage of SOCK_LOCK(so) to include so_count, the socket reference count:
- Assert SOCK_LOCK(so) in macros that directly manipulate so_count: soref(), sorele().
- Assert SOCK_LOCK(so) in macros/functions that rely on the state of so_count: sofree(), sotryfree().
- Acquire SOCK_LOCK(so) before calling these functions or macros in various contexts in the stack, both at the socket and protocol layers.
- In some cases, perform soisdisconnected() before sotryfree(), as this could result in frobbing of a non-present socket if sotryfree() actually frees the socket.
- Note that sofree()/sotryfree() will release the socket lock even if they don't free the socket.
Submitted by: sam
Sponsored by: FreeBSD Foundation
Obtained from: BSD/OS
|
#
1930e303 |
|
11-Jun-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Deorbit COMPAT_SUNOS. We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.
|
#
63732dce |
|
01-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Push the VOP_ADVLOCK() call to release advisory locks on vnode file descriptors out of fdrop_locked() and into vn_closefile(). This removes all knowledge of vnodes from fdrop_locked(), since the lock behavior was specific to vnodes. This also removes the specific requirement for Giant in fdrop_locked(), it's now only required by code that it calls into. Add GIANT_REQUIRED to vn_closefile() since VFS requires Giant.
|
#
7f8a436f |
|
05-Apr-2004 |
Warner Losh <imp@FreeBSD.org> |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core
|
#
a1288c78 |
|
28-Mar-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Conditionally assert Giant in fputsock() based on the value of debug.mpsafenet.
|
#
47934cef |
|
25-Feb-2004 |
Don Lewis <truckman@FreeBSD.org> |
Split the mlock() kernel code into two parts, mlock(), which unpacks the syscall arguments and does the suser() permission check, and kern_mlock(), which does the resource limit checking and calls vm_map_wire(). Split munlock() in a similar way. Enable the RLIMIT_MEMLOCK checking code in kern_mlock(). Replace calls to vslock() and vsunlock() in the sysctl code with calls to kern_mlock() and kern_munlock() so that the sysctl code will obey the wired memory limits. Nuke the vslock() and vsunlock() implementations, which are no longer used. Add a member to struct sysctl_req to track the amount of memory that is wired to handle the request. Modify sysctl_wire_old_buffer() to return an error if its call to kern_mlock() fails. Only wire the minimum of the length specified in the sysctl request and the length specified in its argument list. It is recommended that sysctl handlers that use sysctl_wire_old_buffer() should specify reasonable estimates for the amount of data they want to return so that only the minimum amount of memory is wired no matter what length has been specified by the request. Modify the callers of sysctl_wire_old_buffer() to look for the error return. Modify sysctl_old_user to obey the wired buffer length and clean up its implementation. Reviewed by: bms
|
#
dc08ffec |
|
21-Feb-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Device megapatch 4/6: Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION. Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
|
#
44f4b94b |
|
16-Feb-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Don't bother storing a result when all you need are the side effects.
|
#
a82294d0 |
|
15-Feb-2004 |
David Malone <dwmalone@FreeBSD.org> |
In fdcheckstd the descriptor table should never be shared, so just KASSERT this rather than trying to deal with what happens when file descriptors change out from under us.
|
#
91d5354a |
|
04-Feb-2004 |
John Baldwin <jhb@FreeBSD.org> |
Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it.
- Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.
Submitted by: mtm (mostly, I only did a few cleanups and catchups)
Tested on: i386
Compiled on: alpha, amd64
|
#
a6d4491c |
|
16-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Restore correct semantics for F_DUPFD fcntl. This should fix the errors people have been getting with configure scripts.
|
#
56a9fc0e |
|
16-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
WITNESS won't let us hold two filedesc locks at the same time, so juggle fdp and newfdp around a bit.
|
#
ddce426f |
|
16-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Remove two KASSERTs which were overly paranoid.
|
#
12d568c2 |
|
15-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Take care to drop locks when calling malloc()
|
#
a2fe44e8 |
|
15-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
New file descriptor allocation code, derived from similar code introduced in OpenBSD by Niels Provos. The patch introduces a bitmap of allocated file descriptors which is used to locate available descriptors when a new one is needed. It also moves the task of growing the file descriptor table out of fdalloc(), reducing complexity in both fdalloc() and do_dup(). Debts of gratitude are owed to tjr@ (who provided the original patch on which this work is based), grog@ (for the gdb(4) man page) and rwatson@ (for assistance with pxeboot(8)).
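The bitmap idea described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel's code: the names toy_fdtable, toy_fdused(), toy_fdunused() and toy_fdalloc() are hypothetical, the table has a fixed size, and growing the table (which the commit moves out of fdalloc()) is left out entirely.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names and sizes, chosen for illustration only. */
#define TOY_NFILES   256
#define TOY_WORDBITS ((int)(sizeof(unsigned long) * CHAR_BIT))
#define TOY_NWORDS   ((TOY_NFILES + TOY_WORDBITS - 1) / TOY_WORDBITS)

struct toy_fdtable {
	unsigned long map[TOY_NWORDS];	/* bit set => descriptor in use */
};

/* Mark a descriptor as allocated in the bitmap. */
static void
toy_fdused(struct toy_fdtable *t, int fd)
{
	assert(fd >= 0 && fd < TOY_NFILES);
	t->map[fd / TOY_WORDBITS] |= 1UL << (fd % TOY_WORDBITS);
}

/* Mark a descriptor as free in the bitmap. */
static void
toy_fdunused(struct toy_fdtable *t, int fd)
{
	assert(fd >= 0 && fd < TOY_NFILES);
	t->map[fd / TOY_WORDBITS] &= ~(1UL << (fd % TOY_WORDBITS));
}

/* Return the lowest free descriptor >= minfd and mark it used, or -1. */
static int
toy_fdalloc(struct toy_fdtable *t, int minfd)
{
	if (minfd < 0)
		return (-1);
	for (int fd = minfd; fd < TOY_NFILES; fd++) {
		int w = fd / TOY_WORDBITS;

		/* At a word boundary, skip the whole word if it is full. */
		if (fd % TOY_WORDBITS == 0 && t->map[w] == ~0UL) {
			fd += TOY_WORDBITS - 1;
			continue;
		}
		if ((t->map[w] & (1UL << (fd % TOY_WORDBITS))) == 0) {
			toy_fdused(t, fd);
			return (fd);
		}
	}
	return (-1);
}
```

The point of the bitmap is that a search for a free slot inspects one word per group of descriptors instead of one table entry per descriptor, which matters most when the low descriptors are densely allocated.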
|
#
c9de31f5 |
|
11-Jan-2004 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Mechanical whitespace cleanup.
|
#
0e88a717 |
|
10-Jan-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove long dead code, specifically, code related to munmapfd(). (See also vm/vm_mmap.c revision 1.173.)
|
#
70ad6c21 |
|
28-Dec-2003 |
David Malone <dwmalone@FreeBSD.org> |
Plug a leak of open files that happens when you exec a suid program with one of std{in,out,err} open. This helps with the file descriptor leaks reported on -current. This should probably be merged into 5.2. Reviewed by: ru Tested by: Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>
|
#
e1419c08 |
|
19-Oct-2003 |
David Malone <dwmalone@FreeBSD.org> |
falloc allocates a file structure and adds it to the file descriptor table, acquiring the necessary locks as it works. It usually returns two references to the new descriptor: one in the descriptor table and one via a pointer argument. As falloc releases the FILEDESC lock before returning, there is a potential for a process to close the reference in the file descriptor table before falloc's caller gets to use the file. I don't think this can happen in practice at the moment, because Giant indirectly protects closes. To stop the file being completely closed in this situation, this change makes falloc set the refcount to two when both references are returned. This makes life easier for several of falloc's callers, because the first thing they previously did was grab an extra reference on the file. Reviewed by: iedowse Idea run past: jhb
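A minimal user-space sketch of the reference-count rule described above, assuming a toy descriptor table with no locking. The names toy_file, toy_falloc(), toy_fdrop() and toy_close() are hypothetical stand-ins, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NFILES 16	/* hypothetical fixed table size */

struct toy_file {
	int refcount;
};

struct toy_fdtable {
	struct toy_file *files[TOY_NFILES];
};

/* Drop one reference; free the file on the last one. Returns 1 if freed. */
static int
toy_fdrop(struct toy_file *fp)
{
	assert(fp->refcount > 0);
	if (--fp->refcount > 0)
		return (0);
	free(fp);
	return (1);
}

/*
 * Allocate a file and install it in the first free slot. When the caller
 * asks for the file pointer too, start with two references: one owned by
 * the table and one owned by the caller.
 */
static int
toy_falloc(struct toy_fdtable *t, struct toy_file **resultfp, int *resultfd)
{
	for (int fd = 0; fd < TOY_NFILES; fd++) {
		struct toy_file *fp;

		if (t->files[fd] != NULL)
			continue;
		fp = calloc(1, sizeof(*fp));
		if (fp == NULL)
			return (-1);
		fp->refcount = (resultfp != NULL) ? 2 : 1;
		t->files[fd] = fp;
		if (resultfp != NULL)
			*resultfp = fp;
		if (resultfd != NULL)
			*resultfd = fd;
		return (0);
	}
	return (-1);
}

/* Close a slot: clear it and drop the table's reference. -1 if empty. */
static int
toy_close(struct toy_fdtable *t, int fd)
{
	struct toy_file *fp = t->files[fd];

	if (fp == NULL)
		return (-1);
	t->files[fd] = NULL;
	return (toy_fdrop(fp));
}
```

With this rule, a close of the slot that races ahead of the caller only drops the table's reference; the file stays alive until the caller releases its own reference with toy_fdrop().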
|
#
c142b0fc |
|
01-Oct-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Remove the global variable 'cmask', which was used to initialize the fd_cmask field in the file descriptor structure for the first process indirectly from CMASK, and when an fd structure is initialized before being filled in, and instead just use CMASK. This appears to be an artifact left over from the initial integration of quotas into BSD. Suggested by: peter
|
#
d2cce3d6 |
|
04-Aug-2003 |
David Malone <dwmalone@FreeBSD.org> |
Do some minor Giant pushdown made possible by copyin, fget, fdrop, malloc and mbuf allocation all not requiring Giant.
1) ostat, fstat and nfstat don't need Giant until they call fo_stat.
2) accept can copyin the address length without grabbing Giant.
3) sendit doesn't need Giant, so don't bother grabbing it until kern_sendit.
4) move Giant grabbing from each individual recv* syscall to recvit.
|
#
fbe1bddd |
|
28-Jul-2003 |
Alan Cox <alc@FreeBSD.org> |
Revision 1.51 of vm/uma_core.c modified uma_large_free() to acquire Giant when needed. So, don't do it here.
|
#
2e4a71cd |
|
28-Jul-2003 |
Robert Watson <rwatson@FreeBSD.org> |
When exporting file descriptor data for threads invoking the kern.file sysctl, don't return information about processes that fail p_cansee(td, p). This prevents sockstat and related programs from seeing file descriptors owned by processes not in the same jail as the thread, as well as having implications for MAC, etc. This is a partial solution: it permits an information leak about the number of descriptors in the sizing calculation (but this is not new information, you can also get it from kern.openfiles), and doesn't attempt to mask file descriptors based on the properties of the descriptor, only the process referencing it. However, it provides most of what you want under most circumstances, without complicating the locking. PR: 54211 Based on a patch submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
|
#
7c89f162 |
|
27-Jul-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.
|
#
18e8d4e7 |
|
25-Jul-2003 |
Alan Cox <alc@FreeBSD.org> |
revision 1.51 of vm/uma_core.c modified uma_large_malloc() to acquire Giant when needed.
|
#
857d9c60 |
|
12-Jul-2003 |
Don Lewis <truckman@FreeBSD.org> |
Extend the mutex pool implementation to permit the creation and use of multiple mutex pools with different options and sizes. Mutex pools can be created with either the default sleep mutexes or with spin mutexes. A dynamically created mutex pool can now be destroyed if it is no longer needed. Create two pools by default, one that matches the existing pool that uses the MTX_NOWITNESS option that should be used for building higher level locks, and a new pool with witness checking enabled. Modify the users of the existing mutex pool to use the appropriate pool in the new implementation. Reviewed by: jhb
|
#
1226914c |
|
03-Jul-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use the f_vnode field to tell which file descriptors have a vnode.
|
#
3b6d9652 |
|
22-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a f_vnode field to struct file. Several of the subtypes have an associated vnode which is used for stuff like the f*() functions. By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use. At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.
|
#
eaaca5de |
|
20-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't (re)initialize f_gcflag to zero. Move initialization of DTYPE_VNODE specific field f_seqcount into the DTYPE_VNODE specific code.
|
#
bab88630 |
|
19-Jun-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Unlock the struct file lock before acquiring Giant, otherwise we can deadlock because of lock order reversals. This was not caught because Witness ignores pool mutexes right now. Diagnosis and help: truckman Noticed by: pho
|
#
4d7dfc31 |
|
18-Jun-2003 |
Mike Silbersack <silby@FreeBSD.org> |
Add a rate limited message reporting when kern.maxfiles is exceeded, reporting who did it. Also, fix a style bug introduced in the previous change. MFC after: 1 week
|
#
438f085b |
|
18-Jun-2003 |
Mike Silbersack <silby@FreeBSD.org> |
Reserve the last 5% of file descriptors for root use. This should allow systems to fail more gracefully when a file descriptor exhaustion situation occurs. Original patch by: David G. Andersen <dga@lcs.mit.edu> PR: 45353 MFC after: 1 week
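The reservation can be sketched as a simple admission check. This is an illustration only: toy_limits and toy_may_open() are hypothetical names, the 5% figure is computed here as maxfiles / 20 (the rounding in the real code is not stated in this log), and returning ENFILE for both cases is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical snapshot of the system-wide open file accounting. */
struct toy_limits {
	int maxfiles;	/* hard ceiling on open files */
	int openfiles;	/* currently open files */
};

/*
 * Return 0 if one more file may be opened, ENFILE otherwise.
 * Non-root callers are refused once only the reserved tail remains.
 */
static int
toy_may_open(const struct toy_limits *l, unsigned int ruid)
{
	int reserved = l->maxfiles / 20;	/* last 5%, assumed rounding */
	int userlimit = l->maxfiles - reserved;

	if (l->openfiles >= l->maxfiles)
		return (ENFILE);
	if (ruid != 0 && l->openfiles >= userlimit)
		return (ENFILE);
	return (0);
}
```

For example, with maxfiles set to 1000, an unprivileged caller is refused once 950 files are open, while root can keep opening files until the table is actually full.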
|
#
7c2d2efd |
|
18-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Initialize struct fileops with C99 sparse initialization.
|
#
677b542e |
|
10-Jun-2003 |
David E. O'Brien <obrien@FreeBSD.org> |
Use __FBSDID().
|
#
ad05d580 |
|
02-Jun-2003 |
Tor Egge <tegge@FreeBSD.org> |
Add tracking of process leaders sharing a file descriptor table and allow a file descriptor table to be shared between multiple process leaders. PR: 50923
|
#
90471005 |
|
31-May-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove needless return Found by: FlexeLint
|
#
c1dca9ab |
|
15-May-2003 |
Robert Watson <rwatson@FreeBSD.org> |
VOP_PATHCONF() requires a vnode lock; this patch adds locking to fpathconf(). The lock is held for direct calls to VOP_PATHCONF() in pathconf() already. Approved by: re (jhb) Pointed out by: DEBUG_VFS_LOCKS
|
#
51da11a2 |
|
29-Apr-2003 |
Mark Murray <markm@FreeBSD.org> |
Fix some easy, global, lint warnings. In most cases, this means making some local variables static. In a couple of cases, this means removing an unused variable.
|
#
104a9b7e |
|
29-Apr-2003 |
Alexander Kabaev <kan@FreeBSD.org> |
Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
#
7ac40f5f |
|
02-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Gigacommit to improve device-driver source compatibility between branches: Initialize struct cdevsw using C99 sparse initialization and remove all initializations to default values. This patch is automatically generated and has been tested by compiling LINT with all the fields in struct cdevsw in reverse order on alpha, sparc64 and i386. Approved by: re(scottl)
|
#
c6faf3bf |
|
01-Mar-2003 |
Tor Egge <tegge@FreeBSD.org> |
Remove unneeded code added in revision 1.188.
|
#
3303c14b |
|
23-Feb-2003 |
Scott Long <scottl@FreeBSD.org> |
Don't NULL out p_fd until after closefd() has been called. This isn't totally correct, but it has caused breakage for too long. I welcome someone with more fd fu to fix it correctly.
|
#
750a91d8 |
|
21-Feb-2003 |
Mike Makonnen <mtm@FreeBSD.org> |
Remove a comment which hasn't been true since rev. 1.158 Approved by: jhb, markm (mentor)(implicit)
|
#
a163d034 |
|
18-Feb-2003 |
Warner Losh <imp@FreeBSD.org> |
Back out M_* changes, per decision of the TRB. Approved by: trb
|
#
218a01e0 |
|
15-Feb-2003 |
Tor Egge <tegge@FreeBSD.org> |
Avoid file lock leakage when linuxthreads port or rfork is used:
- Mark the process leader as having an advisory lock
- Check if process leader is marked as having advisory lock when closing file
- Check that file is still open after lock has been obtained
- Don't allow file descriptor table sharing between processes with different leaders
PR: 10265
Reviewed by: alfred
|
#
e7d6662f |
|
14-Feb-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Do not allow kqueues to be passed via unix domain sockets.
|
#
edf6699a |
|
14-Feb-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Fix LOR with PROC/filedesc. Introduce fdesc_mtx that will be used as a barrier between free'ing filedesc structures. Basically if you want to access another process's filedesc, you want to hold this mutex over the entire operation.
|
#
42e1b74a |
|
11-Feb-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Don't lock FILEDESC under PROC. The locking here needs to be revisited, but this ought to get rid of the LOR messages that people are complaining about for now. I imagine either I or someone else interested with smp will eventually clear this up.
|
#
4af0d0c2 |
|
29-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
NODEVFS cleanup: remove #ifdefs
|
#
a448a15b |
|
21-Jan-2003 |
Jeffrey Hsu <hsu@FreeBSD.org> |
Add missing SMP file locks around read-modify-write operations on the flag field. Reviewed by: rwatson
|
#
44956c98 |
|
21-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
7e760e14 |
|
19-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Originally when DEVFS was added, a global variable "devfs_present" was used to control code which was conditional on DEVFS' presence since this avoided the need for large-scale source pollution with #include "opt_geom.h" Now that we approach making DEVFS standard, replace these tests with an #ifdef to facilitate mechanical removal once DEVFS becomes non-optional. No functional change by this commit.
|
#
48e3128b |
|
12-Jan-2003 |
Matthew Dillon <dillon@FreeBSD.org> |
Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.
|
#
cd72f218 |
|
11-Jan-2003 |
Matthew Dillon <dillon@FreeBSD.org> |
Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.
|
#
f0c09328 |
|
06-Jan-2003 |
Jacques Vidrine <nectar@FreeBSD.org> |
Correct file descriptor leaks in lseek and do_dup. The leak in lseek was introduced in vfs_syscalls.c revision 1.218. The leak in do_dup was introduced in kern_descrip.c revision 1.158. Submitted by: iedowse
|
#
c522c1bf |
|
31-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
fdcopy() only needs a filedesc pointer.
|
#
03282e6e |
|
31-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
purge 'register'.
|
#
c7f1c11b |
|
31-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Since fdshare() and fdinit() only operate on filedescs, make them take pointers to filedesc structures instead of threads. This makes it more clear that they do not do any voodoo with the thread/proc or anything other than the filedesc passed in or returned. Remove some XXX KSE's as this resolves the issue.
|
#
59c97598 |
|
31-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
fdinit() does not need to lock the filedesc it is creating as no one besides itself has access until the function returns.
|
#
f0bc12ee |
|
27-Dec-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Improve consistency between devfs and MAKEDEV: use UID_ROOT and GID_WHEEL instead of UID_BIN and GID_BIN for /dev/fd/* entries. Submitted by: kris
|
#
a7010ee2 |
|
24-Dec-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
White-space changes.
|
#
f3a68211 |
|
23-Dec-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Detediousficate declaration of fileops array members by introducing typedefs for them.
|
#
9d0fffd3 |
|
13-Dec-2002 |
Tim J. Robbins <tjr@FreeBSD.org> |
Drop filedesc lock and acquire Giant around calls to malloc() and free(). These call uma_large_malloc() and uma_large_free() which require Giant. Fixes panic when descriptor table is larger than KMEM_ZMAX bytes noticed by kkenn. Reviewed by: jhb
|
#
04f4a164 |
|
26-Nov-2002 |
John Baldwin <jhb@FreeBSD.org> |
If the file descriptors passed into do_dup() are negative, return EBADF instead of panicking. Also, perform some of the simpler sanity checks on the fds before acquiring the filedesc lock. Approved by: re Reported by: Dan Nelson <dan@emsphone.com> and others
|
#
c7047e52 |
|
27-Oct-2002 |
Garrett Wollman <wollman@FreeBSD.org> |
Change the way support for asynchronous I/O is indicated to applications to conform to 1003.1-2001. Make it possible for applications to actually tell whether or not asynchronous I/O is supported. Since FreeBSD's aio implementation works on all descriptor types, don't call down into file or vnode ops when [f]pathconf() is asked about _PC_ASYNC_IO; this avoids the need for every file and vnode op to know about it.
|
#
4562d726 |
|
18-Oct-2002 |
John Baldwin <jhb@FreeBSD.org> |
Don't lock the proc lock to clear p_fd. p_fd isn't protected by the proc lock.
|
#
bf3e55aa |
|
16-Oct-2002 |
John Baldwin <jhb@FreeBSD.org> |
Many style and whitespace fixes. Submitted by: bde (mostly)
|
#
18d9bd8f |
|
16-Oct-2002 |
John Baldwin <jhb@FreeBSD.org> |
Sort includes a bit. Submitted by: bde
|
#
7fd1f2b8 |
|
15-Oct-2002 |
John Baldwin <jhb@FreeBSD.org> |
Argh. Put back setting of P_ADVLOCK for the F_WRLCK case that was accidentally lost in the previous revision. Submitted by: bde Pointy hat to: jhb
|
#
60a6965a |
|
14-Oct-2002 |
John Baldwin <jhb@FreeBSD.org> |
Remove the leaderp variable and just access p_leader directly. The p_leader field is not protected by the proc lock but is only set during fork1() by the parent process and never changes.
|
#
91e97a82 |
|
02-Oct-2002 |
Don Lewis <truckman@FreeBSD.org> |
In an SMP environment post-Giant it is no longer safe to blindly dereference the struct sigio pointer without any locking. Change fgetown() to take a reference to the pointer instead of a copy of the pointer and call SIGIO_LOCK() before copying the pointer and dereferencing it. Reviewed by: rwatson
|
#
dde1c2c0 |
|
15-Sep-2002 |
Thomas Moestl <tmm@FreeBSD.org> |
fcntl(..., F_SETLKW, ...) takes a pointer to a struct flock just like F_SETLK does, so it also needs this structure copied in in fcntl() before calling kern_fcntl().
|
#
06be2aaa |
|
14-Sep-2002 |
Nate Lawson <njl@FreeBSD.org> |
Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)
|
#
4e115a85 |
|
13-Sep-2002 |
Thomas Moestl <tmm@FreeBSD.org> |
Fix fcntl(..., F_GETOWN, ...) and fcntl(..., F_SETOWN, ...) on sparc64 by not passing a pointer to a register_t or intptr_t when the code in the lower layers expects one to an int.
|
#
5fc30313 |
|
03-Sep-2002 |
John Baldwin <jhb@FreeBSD.org> |
- Change falloc() to acquire an fd from the process table last so that it can do it w/o needing to hold the filelist_lock sx lock.
- fdalloc() doesn't need Giant to call free() anymore. It also doesn't need to drop and reacquire the filedesc lock around free() now as a result.
- Try to make the code that copies fd tables when extending the fd table in fdalloc() a bit more readable by performing assignments in separate statements. This is still a bit ugly though.
- Use max() instead of an if statement to figure out the starting point in the search-for-a-free-fd loop in fdalloc() so it reads better next to the min() in the previous line.
- Don't grow nfiles in steps up to the size needed if we dup2() to some really large number. Go ahead and double 'nfiles' in a loop prior to doing the malloc().
- malloc() doesn't need Giant now.
- Use malloc() and free() instead of MALLOC() and FREE() in fdalloc().
- Check to see if the size we are going to grow to is too big, not if the current size of the fd table is too big in the loop in fdalloc(). This means if we are out of space or if dup2() requests too high of a fd, then we will return an error before we go off and try to allocate some huge table and copy the existing table into it.
- Move all of the logic for dup'ing a file descriptor into do_dup() instead of putting some of it in do_dup() and duplicating other parts in four different places. This makes dup(), dup2(), and fcntl(F_DUPFD) basically wrappers of do_dup now. fcntl() still has an extra check since it uses a different error return value in one case than the other functions.
- Add a KASSERT() for an assertion that may not always be true where the fdcheckstd() function assumes that falloc() returns the fd requested and not some other fd. I think that the assertion is always true because we are always single-threaded when we get to this point, but if one was using rfork() and another process sharing the fd table were playing with the fd table, there could be a problem.
- To handle the problem of a file descriptor we are dup()'ing being closed out from under us in dup() in general, do_dup() now obtains a reference on the file in question before calling fdalloc(). If after the call to fdalloc() the file for the fd we are dup'ing is a different file, then we drop our reference on the original file and return EBADF. This race was only handled in the dup2() case before and would just retry the operation. The error return allows the user to know they are being stupid since they have a locking bug in their app instead of dup'ing some other descriptor and returning it to them.
Tested on: i386, alpha, sparc64
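The table-growth items in this commit (double 'nfiles' in a loop, and check the target size rather than the current size) can be sketched roughly as follows. toy_grow_size() and its parameters are hypothetical, and returning EMFILE when the request exceeds the limit is an assumption made for the sketch; the log only says that an error is returned before any allocation is attempted.

```c
#include <assert.h>
#include <errno.h>

/*
 * Compute the table size needed to hold descriptor `want`.
 * nfiles is the current size, maxfd the largest permitted size.
 * On success store the new size in *newsize and return 0; if `want`
 * can never fit, return EMFILE (assumed) without sizing anything.
 */
static int
toy_grow_size(int nfiles, int want, int maxfd, int *newsize)
{
	int n = nfiles;

	assert(nfiles > 0 && want >= 0 && maxfd > 0);

	/* Reject the request up front, before any allocation or copy. */
	if (want >= maxfd)
		return (EMFILE);

	/* Double until `want` fits, clamping at the limit. */
	while (n <= want) {
		if (n > maxfd / 2) {
			n = maxfd;
			break;
		}
		n *= 2;
	}
	*newsize = n;
	return (0);
}
```

Computing the final size first means a dup2() to a large descriptor number leads to a single allocation and a single copy of the old table, rather than one per doubling step.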
|
#
49c2ff15 |
|
02-Sep-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Split fcntl() into a wrapper and a kernel-callable kern_fcntl() implementation. The wrapper is responsible for copying additional structure arguments (struct flock) to and from userland.
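The wrapper/implementation split can be sketched in user-space C. This is a sketch under stated assumptions: toy_copyin() stands in for copyin(9) and simply fails on a NULL pointer, the command numbers and struct toy_flock are made up, and toy_kern_fcntl() does nothing useful beyond recording the lock type it was handed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command numbers and lock structure. */
#define TOY_F_GETFD 1
#define TOY_F_SETLK 12

struct toy_flock {
	long start;
	long len;
	int type;
};

/* Stand-in for copyin(9): treat NULL as a bad user address. */
static int
toy_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return (EFAULT);
	memcpy(kaddr, uaddr, len);
	return (0);
}

static int toy_last_lock_type;	/* records what the implementation saw */

/* Kernel-callable part: arg is an integer or a kernel-space pointer. */
static int
toy_kern_fcntl(int fd, int cmd, intptr_t arg)
{
	if (fd < 0)
		return (EBADF);
	switch (cmd) {
	case TOY_F_SETLK:
		toy_last_lock_type = ((struct toy_flock *)arg)->type;
		return (0);
	case TOY_F_GETFD:
		return (0);
	default:
		return (EINVAL);
	}
}

/* Syscall wrapper: the only place that touches "user" memory. */
static int
toy_fcntl(int fd, int cmd, intptr_t uarg)
{
	struct toy_flock fl;
	int error;

	switch (cmd) {
	case TOY_F_SETLK:
		error = toy_copyin((const void *)uarg, &fl, sizeof(fl));
		if (error != 0)
			return (error);
		return (toy_kern_fcntl(fd, cmd, (intptr_t)&fl));
	default:
		return (toy_kern_fcntl(fd, cmd, uarg));
	}
}
```

Keeping the copy in the wrapper lets in-kernel callers, such as ABI emulation layers, pass structures they already hold in kernel memory straight to the implementation.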
|
#
93b0017f |
|
25-Aug-2002 |
Philippe Charnier <charnier@FreeBSD.org> |
Replace various spelling with FALLTHROUGH which is lint()able
|
#
d49fa1ca |
|
16-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
In continuation of early fileop credential changes, modify fo_ioctl() to accept an 'active_cred' argument reflecting the credential of the thread initiating the ioctl operation.
- Change fo_ioctl() to accept active_cred; change consumers of the fo_ioctl() interface to generally pass active_cred from td->td_ucred.
- In fifofs, initialize filetmp.f_cred to ap->a_cred so that the invocations of soo_ioctl() are provided access to the calling f_cred. Pass ap->a_td->td_ucred as the active_cred, but note that this is required because we don't yet distinguish file_cred and active_cred in invoking VOP's.
- Update kqueue_ioctl() for its new argument.
- Update pipe_ioctl() for its new argument, pass active_cred rather than td_ucred to MAC for authorization.
- Update soo_ioctl() for its new argument.
- Update vn_ioctl() for its new argument, use active_cred rather than td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR().
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
|
#
ea6027a8 |
|
15-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Make similar changes to fo_stat() and fo_poll() as made earlier to fo_read() and fo_write(): explicitly use the cred argument to fo_poll() as "active_cred" using the passed file descriptor's f_cred reference to provide access to the file credential. Add an active_cred argument to fo_stat() so that implementers have access to the active credential as well as the file credential. Generally modify callers of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which was redundantly provided via the fp argument. This set of modifications also permits threads to perform these operations on behalf of another thread without modifying their credential.
Trickle this change down into fo_stat/poll() implementations:
- badfo_poll(), badfo_stat(): modify/add arguments.
- kqueue_poll(), kqueue_stat(): modify arguments.
- pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to MAC checks rather than td->td_ucred.
- soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather than cred to pru_sopoll() to maintain current semantics.
- sopoll(): modify arguments.
- vn_poll(), vn_statfile(): modify/add arguments, pass new arguments to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL() to maintain current semantics.
- vn_close(): rename cred to file_cred to reflect reality while I'm here.
- vn_stat(): Add active_cred and file_cred arguments to vn_stat() and consumers so that this distinction is maintained at the VFS as well as 'struct file' layer. Pass active_cred instead of td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics.
- fifofs: modify the creation of a "filetemp" so that the file credential is properly initialized and can be used in the socket code if desired. Pass ap->a_td->td_ucred as the active credential to soo_poll(). If we teach the vnop interface about the distinction between file and active credentials, we would use the active credential here.
Note that current inconsistent passing of active_cred vs. file_cred to VOP's is maintained. It's not clear why GETATTR would be authorized using active_cred while POLL would be authorized using file_cred at the file system level.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
|
#
9ca43589 |
|
15-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes are largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
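The credential selection that this message describes for vn_rdwr() can be sketched in isolation. The following is a toy user-space model, not the kernel code: `struct toy_ucred`, `TOY_NOCRED`, and `toy_select_vop_cred()` are hypothetical names invented for illustration, and the sketch assumes that "no file credential" is represented by a NULL pointer, as the message implies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct ucred. */
struct toy_ucred {
	unsigned int uid;
};

/* Hypothetical stand-in for NOCRED: no file credential supplied. */
#define TOY_NOCRED ((const struct toy_ucred *)NULL)

/*
 * Pick the credential used to authorize the VOP: the file credential
 * when one is supplied, otherwise the active (thread) credential.
 * MAC authorization would use active_cred in either case.
 */
const struct toy_ucred *
toy_select_vop_cred(const struct toy_ucred *active_cred,
    const struct toy_ucred *file_cred)
{
	return (file_cred != TOY_NOCRED ? file_cred : active_cred);
}
```

A caller operating on a struct file would pass both credentials; a caller without a file context would pass `TOY_NOCRED` as the second argument and get the active credential back.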
|
#
aefe27a2 |
|
30-Jul-2002 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Have the kern.file sysctl export xfiles rather than files. The truth is out there! Sponsored by: DARPA, NAI Labs
|
#
5c38b6db |
|
28-Jul-2002 |
Don Lewis <truckman@FreeBSD.org> |
Wire the sysctl output buffer before grabbing any locks to prevent SYSCTL_OUT() from blocking while locks are held. This should only be done when it would be inconvenient to make a temporary copy of the data and defer calling SYSCTL_OUT() until after the locks are released.
|
#
3d3f20cb |
|
16-Jul-2002 |
John Baldwin <jhb@FreeBSD.org> |
Preallocate a struct file as the first thing in falloc() before we lock the filelist_lock and check nfiles. This closes a race where we had to unlock the filedesc to re-lock the filelist_lock. Reported by: David Xu Reviewed by: bde (mostly)
|
#
7f05b035 |
|
28-Jun-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
More caddr_t removal, make fo_ioctl take a void * instead of a caddr_t.
|
#
4cc20ab1 |
|
31-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Back out my last commit of locking down a socket; it conflicts with hsu's work. Requested by: hsu
|
#
243917fe |
|
19-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each member in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred
|
#
d394511d |
|
16-May-2002 |
Tom Rhodes <trhodes@FreeBSD.org> |
More s/file system/filesystem/g
|
#
e649887b |
|
06-May-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Make funsetown() take a 'struct sigio **' so that the locking can be done internally. Ensure that no one can fsetown() to a dying process/pgrp. We need to check the process for P_WEXIT to see if it's exiting. Process groups are already safe because there is no such thing as a pgrp zombie, therefore the proctree lock completely protects the pgrp from having sigio structures associated with it after it runs funsetownlst. Add sigio lock to witness list under proctree and allproc, but over proc and pgrp. Seigo Tanimura helped with this.
|
#
6041fa0a |
|
03-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
As malloc(9) and free(9) are now Giant-free, remove the Giant lock across malloc(9) and free(9) of a pgrp or a session.
|
#
c8d8a686 |
|
02-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Fix the lock order reversal between the sigio lock and a process/pgrp lock in funsetownlst() by locking the sigio lock across funsetownlst().
|
#
f1320723 |
|
01-May-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Redo the sigio locking. Turn the sigio sx into a mutex. Sigio lock is really only needed to protect interrupts from dereferencing the sigio pointer in an object when the sigio itself is being destroyed. In order to do this in the most unintrusive manner change pgsigio's sigio * argument into a **, that way we can lock internally to the function.
|
#
1cf1a725 |
|
29-Apr-2002 |
Jeroen Ruigrok van der Werven <asmodai@FreeBSD.org> |
Fix indentation which I did wrong in a previous commit. Submitted by: bde
|
#
d48d4b25 |
|
27-Apr-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Add a global sx sigio_lock to protect the pointer to the sigio object of a socket. This avoids lock order reversal caused by locking a process in pgsigio(). sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now require sigio_lock to be locked. Provide sowwakeup_locked(), soisconnected_locked(), and so on in case where we have to modify a socket and wake up a process atomically.
|
#
ea5b39d0 |
|
22-Apr-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Don't FILEDESC_LOCK around calls to falloc().
|
#
1c2451c2 |
|
19-Apr-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Push down Giant for setpgid(), setsid() and aio_daemon(). Giant protects only malloc(9) and free(9).
|
#
e983a376 |
|
18-Apr-2002 |
Jacques Vidrine <nectar@FreeBSD.org> |
When exec'ing a set[ug]id program, make sure that the stdio file descriptors (0, 1, 2) are allocated by opening /dev/null for any which are not already open. Reviewed by: alfred, phk MFC after: 2 days
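The idea in this commit has a straightforward user-space analogue, sketched below. This is not the kernel implementation: `ensure_stdio_open()` is a hypothetical helper, and opening /dev/null with `O_RDWR` is an assumption made for the sketch. It probes descriptors 0, 1, and 2 with `fcntl(F_GETFD)` and fills any closed slot with /dev/null.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure descriptors 0, 1 and 2 are all open,
 * pointing any closed one at /dev/null. Returns 0 on success, -1 on error.
 */
int
ensure_stdio_open(void)
{
	for (int fd = 0; fd <= 2; fd++) {
		/* F_GETFD succeeds only if fd refers to an open descriptor. */
		if (fcntl(fd, F_GETFD) != -1)
			continue;
		if (errno != EBADF)
			return (-1);

		int nfd = open("/dev/null", O_RDWR);
		if (nfd == -1)
			return (-1);

		/* open() normally returns the lowest free slot; handle the rest. */
		if (nfd != fd) {
			if (dup2(nfd, fd) == -1) {
				close(nfd);
				return (-1);
			}
			close(nfd);
		}
	}
	return (0);
}
```

Filling the slots matters because a later open() in a privileged program would otherwise be handed descriptor 0, 1, or 2, and ordinary stdio output could end up in an unrelated file.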
|
#
ba626c1d |
|
16-Apr-2002 |
John Baldwin <jhb@FreeBSD.org> |
Lock proctree_lock instead of pgrpsess_lock.
|
#
bcbf4411 |
|
13-Apr-2002 |
Jeroen Ruigrok van der Werven <asmodai@FreeBSD.org> |
Use the correct macros for F_SETFD/F_GETFD instead of magic numbers. Reflect that fact in the manual page. PR: 12723 Submitted by: Peter Jeremy <peter.jeremy@alcatel.com.au> Approved by: bde MFC after: 2 weeks
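From the caller's side, the same principle means using `FD_CLOEXEC` rather than a literal 1 with `F_GETFD`/`F_SETFD`. The sketch below is a user-space illustration only; `set_cloexec()` is a hypothetical helper name, and the commit itself concerns the kernel-side handling.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set the close-on-exec flag on fd while
 * preserving any other descriptor flags. Returns 0 on success,
 * -1 on error (with errno set by fcntl).
 */
int
set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags == -1)
		return (-1);
	/* Use the named macro, not a magic number. */
	if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
		return (-1);
	return (0);
}
```

Reading the flags first and OR-ing in `FD_CLOEXEC` keeps the code correct if further descriptor flags are ever defined.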
|
#
6008862b |
|
04-Apr-2002 |
John Baldwin <jhb@FreeBSD.org> |
Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64
|
#
5cf4bceb |
|
29-Mar-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
The description of fd_mtx is "filedesc structure."
|
#
c897b813 |
|
19-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.
|
#
4d77a549 |
|
19-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove __P.
|
#
8355f576 |
|
19-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@
|
#
4a950215 |
|
18-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Close a race when vfs_syscalls.c:checkdirs() runs. To do this protect the filedesc pointer in the proc with PROC_LOCK in both checkdirs() and kern_descrip.c:fdfree().
|
#
628abf6c |
|
15-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Giant pushdown for read/write/pread/pwrite syscalls. kern/kern_descrip.c: Acquire Giant in fdrop_locked when file refcount hits zero, this removes the requirement for the caller to own Giant for the most part. kern/kern_ktrace.c: Acquire Giant in ktrgenio, simplifies locking in upper read/write syscalls. kern/vfs_bio.c: Acquire Giant in bwillwrite if needed. kern/sys_generic.c Giant pushdown, remove Giant for: read, pread, write and pwrite. readv and writev aren't done yet because of the possible malloc calls for iov to uio processing. kern/sys_socket.c Grab Giant in the socket fo_read/write functions. kern/vfs_vnops.c Grab Giant in the vnode fo_read/write functions.
|
#
a854ed98 |
|
27-Feb-2002 |
John Baldwin <jhb@FreeBSD.org> |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
#
f591779b |
|
23-Feb-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Lock struct pgrp, session and sigio. New locks are: - pgrpsess_lock which locks the whole pgrps and sessions, - pg_mtx which protects the pgrp members, and - s_mtx which protects the session members. Please refer to sys/proc.h for the coverage of these locks. Changes on the pgrp/session interface: - pgfind() needs the pgrpsess_lock held. - The caller of enterpgrp() is responsible to allocate a new pgrp and session. - Call enterthispgrp() in order to enter an existing pgrp. - pgsignal() requires a pgrp lock held. Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)
|
#
1037bbb1 |
|
08-Feb-2002 |
Peter Wemm <peter@FreeBSD.org> |
Fix broken Giant locking protocol introduced in rev 1.114. You cannot unlock Giant if it is not locked in the first place. This made the nfstat(2) syscall (#278) a nice panic(2) implementation.
|
#
3865fa13 |
|
01-Feb-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove bogus assertion in dup2 that can lead to panics when kernel threads race for a file slot. dup2(2) incorrectly assumes that if it needs to grow the ofiles array that it will get what it wants. This assertion was valid before we allowed shared filedescriptor tables but is now incorrect. The assertion can trigger superfluous panics if the thread doing a dup2 loses a race with another thread while possibly blocked in the MALLOC call in fdalloc. Another thread may grab the slot we are requesting, which makes fdalloc return something other than what we asked for, thereby triggering the bogus assertion. MFC after: 2 weeks Reviewed by: phk
|
#
2b397439 |
|
01-Feb-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Avoid lock order reversal filedesc/Giant when calling FREE() in fdalloc by unlocking the filedesc before calling FREE(). Submitted by: bde
|
#
eb209311 |
|
29-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Attempt to fix up select(2) and poll(2), this should fix some races with other threads as well as speed up the interfaces. To fix the race and accomplish the speedup, remove selholddrop and pollholddrop. The entire concept is somewhat bogus because holding the individual struct file pointers offers us no guarantees that another thread context won't close it on us thereby removing our access to our own reference. Selholddrop and pollholddrop also would do multiple locks and unlocks of mutexes _per-file_ in the fd arrays to be scanned, this needed to be sped up. Instead of using selholddrop and pollholddrop, simply hold the filedesc lock over the selscan and pollscan functions. This should protect us against close(2)'s on the files as well as reduce the multiple lock/unlock pairs per fd into a single lock over the filedesc.
|
#
5980a85f |
|
29-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Backout 1.120, EINVAL isn't a proper error return when the passed fd is negative, the 'pointer' referred to by the manpage is actually the struct file's f_offset field. Pointed out by: bde
|
#
095f670d |
|
23-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
in fget() return EINVAL when the descriptor requested is negative.
|
#
767567d3 |
|
20-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
use mutex pools for "struct file" locking. fix indentation of FILE_LOCK/UNLOCK macros while I'm here.
|
#
74aac58b |
|
14-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Push down Giant in dup(2) and dup2(2), Giant is only needed when calling closef() in the case of dup2(2) duping over a descriptor and when fdalloc must grow or free a filedesc.
|
#
a4db4953 |
|
13-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Replace ffind_* with fget calls. Make fget MPsafe. Make fgetvp and fgetsock use the fget subsystem to reduce code bloat. Push giant down in fpathconf().
|
#
ba868b0d |
|
12-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Comment fdrop and fdrop_locked functions.
|
#
c2824dd4 |
|
12-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Implement ffind_hold using ffind_lock. Recommended by: jhb
|
#
426da3bc |
|
13-Jan-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file *fp); /* increments reference count on a file */ struct file * fhold_locked(struct file *fp); /* like fhold but expects file to be locked */ struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.
|
#
2b846bd3 |
|
14-Dec-2001 |
Jonathan Lemon <jlemon@FreeBSD.org> |
When removing kqueue descriptors from the descriptor table during a fork, update fd_freefile and fd_lastfile as well, to keep things in sync. Pointed out by: Debbie Chu <dchu@juniper.net>
|
#
b1e4abd2 |
|
16-Nov-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Give struct socket structures a ref counting interface similar to vnodes. This will hopefully serve as a base from which we can expand the MP code. We currently do not attempt to obtain any mutex or SX locks, but the door is open to add them when we nail down exactly how that part of it is going to work.
|
#
b064d43d |
|
13-Nov-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
remove holdfp() Replace uses of holdfp() with fget*() or fgetvp*() calls as appropriate introduce fget(), fget_read(), fget_write() - these functions will take a thread and file descriptor and return a file pointer with its ref count bumped. introduce fgetvp(), fgetvp_read(), fgetvp_write() - these functions will take a thread and file descriptor and return a vref()'d vnode. *_read() requires that the file pointer be FREAD, *_write that it be FWRITE. This continues the cleanup of struct filedesc and struct file access routines which, when we are all through with it, will allow us to then make the API calls MP safe and be able to move Giant down into the fo_* functions.
|
#
bd78cece |
|
11-Oct-2001 |
John Baldwin <jhb@FreeBSD.org> |
Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
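The three primitives described here can be modelled with a toy reference-counted structure. This is a single-threaded user-space sketch under stated assumptions, not the kernel's ucred code: `struct toy_cred` and the `toy_*` functions are hypothetical, locking is omitted, and only the return-value conventions named in the message are reproduced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for struct ucred. */
struct toy_cred {
	unsigned int ref;	/* reference count */
	unsigned int uid;	/* one example credential field */
};

/* Bump the refcount and return the same credential, like crhold(). */
struct toy_cred *
toy_crhold(struct toy_cred *cr)
{
	cr->ref++;
	return (cr);
}

/* Copy credential contents only; the refcount of dst is untouched. */
void
toy_crcopy(struct toy_cred *dst, const struct toy_cred *src)
{
	dst->uid = src->uid;
}

/* True if more than one reference exists, like crshared(). */
bool
toy_crshared(const struct toy_cred *cr)
{
	return (cr->ref > 1);
}
```

Having the hold operation return its argument lets a caller write `newref = toy_crhold(cr);` in one statement, which is the convenience the API change is after.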
|
#
1a6fc8ef |
|
30-Sep-2001 |
Jonathan Lemon <jlemon@FreeBSD.org> |
When FREE()ing kqueue related structures, charge them to the correct bucket. Submitted by: iedowse Forgotten by: jlemon
|
#
9dbea923 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
If an incoming struct proc could have been NULL before, then don't automatically change the code to add struct proc *p = td->td_proc; because now 'td' is probably capable of being NULL too. I expect to see more of this kind of error during the 'weeding' process. It's too easy to make. (junior hacker project.. look for these :-) Submitted by: Mark Peek <mp@freebsd.org>
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
|
#
835a82ee |
|
01-Sep-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Giant Pushdown. Saved the worst P4 tree breakage for last. reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit() setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp() getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid() setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid() setresuid() setresgid() getresuid() getresgid() __setugid() getlogin() setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload() kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize() dup2() dup() fcntl() close() ofstat() fstat() nfstat() fpathconf() flock()
|
#
c8e76343 |
|
29-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
advlock: simplify overflow checks
|
#
4b207d98 |
|
23-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
Move <machine/*> after <sys/*> Add missing fdrop() before EOVERFLOW Pointed by: bde
|
#
69cc1d0d |
|
23-Aug-2001 |
Andrey A. Chernov <ache@FreeBSD.org> |
Detect off_t EOVERFLOW of start/end offset calculations for adv. lock, as POSIX requires.
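The kind of check involved can be illustrated with a simplified sketch. It is not the code from this commit: `lock_range_end()` is a hypothetical helper, `int64_t` stands in for off_t, negative lengths are rejected rather than handled, and a length of zero is assumed to mean "to end of file", reported here as an end of -1.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the inclusive end offset of a lock
 * range [start, start + len - 1] without signed overflow.
 * Returns 0 on success, or EINVAL / EOVERFLOW on failure.
 */
int
lock_range_end(int64_t start, int64_t len, int64_t *endp)
{
	if (start < 0 || len < 0)
		return (EINVAL);
	if (len == 0) {
		*endp = -1;	/* assumed convention: lock to end of file */
		return (0);
	}
	/* Test before adding, so the addition itself can never overflow. */
	if (start > INT64_MAX - (len - 1))
		return (EOVERFLOW);
	*endp = start + (len - 1);
	return (0);
}
```

The comparison is rearranged so that only a subtraction from `INT64_MAX` is performed on untrusted values, which cannot overflow when `len` is positive.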
|
#
c30d4da3 |
|
05-Aug-2001 |
Chris Costello <chris@FreeBSD.org> |
Remove the fildesc_clone() function and its associated unnecessary code. It didn't implement the proper /dev/fd functionality (which would be to include in the directory listing /dev/fd/n if the process has fd n open) anyway. Anything needing access to /dev/fd/n where n > 2 can use the optional fdescfs module, which implements this properly and does not cause any trouble with devfs. Discussed with: phk
|
#
b1fc0ec1 |
|
25-May-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing the original macro that pointed p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restructure many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparison. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done.
o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit
|
#
fb919e4d |
|
01-May-2001 |
Mark Murray <markm@FreeBSD.org> |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
|
#
33a9ed9d |
|
23-Apr-2001 |
John Baldwin <jhb@FreeBSD.org> |
Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning. Reviewed by: -smp, dfr, jake
|
#
f8388051 |
|
25-Mar-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Send the remains (such as I have located) of "block major numbers" to the bit-bucket.
|
#
71d03311 |
|
20-Mar-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make the pseudo-driver for "/dev/fd/*" handle fd's larger than 255. PR: 25936
|
#
608a3ce6 |
|
15-Feb-2001 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Extend kqueue down to the device layer. Backwards compatible approach suggested by: peter
|
#
7cc0979f |
|
08-Dec-2000 |
David Malone <dwmalone@FreeBSD.org> |
Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>
|
#
279d7226 |
|
18-Nov-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>
|
#
4a71feb7 |
|
28-Oct-2000 |
Alan Cox <alc@FreeBSD.org> |
Add missing call to knote_fdclose() in setugidsafety() and fdcloseexec(). Reviewed by: jlemon
|
#
db901281 |
|
02-Sep-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support. If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present". This happily removes an ugly hack from kern/vfs_conf.c. This forces a rename of the eventhandler and the standard clone helper function. Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h> Remove all #includes of opt_devfs.h they no longer matter.
|
#
c58b821e |
|
26-Aug-2000 |
Alfred Perlstein <alfred@FreeBSD.org> |
new sysctl 'kern.openfiles' (exports nfiles to userland)
|
#
d8cd1501 |
|
24-Aug-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Dang, a _clone routine escaped #ifdef DEVFS containment.
|
#
a481b90b |
|
24-Aug-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix panic when removing open device (found by bp@) Implement subdirs. Build the full "devicename" for cloning functions. Fix panic when deleted device goes away. Collapse devfs_dir and devfs_dirent structures. Add proper cloning to the /dev/fd* "device-"driver. Fix a bug in make_dev_alias() handling which made aliases appear multiple times. Use devfs_clone to implement getdiskbyname() Make specfs maintain the stat(2) timestamps per dev_t
|
#
37b087a6 |
|
11-Aug-2000 |
Peter Wemm <peter@FreeBSD.org> |
Clean up some low level bootstrap code: - stop using the evil 'struct trapframe' argument for mi_startup() (formerly main()). There are much better ways of doing it. - do not use prepare_usermode() - setregs() in execve() will do it all for us as long as the p_md.md_regs pointer is set. (which is now done in machdep.c rather than init_main.c. The Alpha port did it this way all along and is much cleaner). - collect all the magic %cr0 etc register settings into one place and have the AP's call that instead of using magic numbers (!!) that keep changing over and over again. - Make it safe to call kthread_create() earlier, including during the device probe sequence. It doesn't need the callback mechanism that NetBSD's version uses. - kthreads created this way are root-less as they exist before the root filesystem is mounted. init(1) is set up so that it acquires the root pointers prior to running. If other kthreads want filesystem access we can make this code more generic. - set all threads start times once we have decided what time it is. - init uses a trampoline rather than the evil prepare_usermode() hack. - kern_descrip.c has a couple of tweaks to deal with forking when there is no rootdir or cwd etc. - adjust the early SYSINIT() sequence so that a few prerequisites are in place. eg: make sure the run queue is initialized before doing forks. With this, the USB code can easily create a kthread to do the device tree discovery. (I have tested it, it works nicely). There are still some open issues before this is truly useful. - tsleep() does not like working before the clock is running. It sort-of tries to spin wait, but it can do more useful things now. - stopping a kthread in kld code at unload time is "interesting" but we have a solution for that. The Alpha code needs no changes for this. It already uses pretty much the same strategies, but a little cleaner.
|
#
77978ab8 |
|
04-Jul-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Previous commit changing SYSCTL_HANDLER_ARGS violated KNF. Pointed out by: bde
|
#
82d9ae4e |
|
03-Jul-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Style police catches up with rev 1.26 of src/sys/sys/sysctl.h: Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grok our sources: -sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
|
#
1a61fa5e |
|
27-Jun-2000 |
Alfred Perlstein <alfred@FreeBSD.org> |
don't panic the system when fpathconf is called on an unsupported filetype.
|
#
e3975643 |
|
25-May-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others
|
#
740a1973 |
|
23-May-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Change the way that the queue(3) structures are declared; don't assume that the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
|
#
cb679c38 |
|
16-Apr-2000 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Introduce kqueue() and kevent(), a kernel event notification facility.
|
#
27e2c03a |
|
20-Jan-2000 |
Warner Losh <imp@FreeBSD.org> |
Fix the style bugs in the style bugs fix. The style bug fix made the new function inconsistent with the rest of this file. The spelling and grammar fixes were good and remain.
|
#
bd9079fa |
|
20-Jan-2000 |
Brian Feldman <green@FreeBSD.org> |
Fix style bugs in the last commit.
|
#
7001be49 |
|
20-Jan-2000 |
Warner Losh <imp@FreeBSD.org> |
bdeize last commit: o Remove opt_dontuse.h and ifdef PROCFS Submitted by: bde, peter
|
#
5e266442 |
|
20-Jan-2000 |
Warner Losh <imp@FreeBSD.org> |
When we are execing a setugid program, and we have a procfs filesystem file open in one of the special file descriptors (0, 1, or 2), close it before completing the exec. Submitted by: nergal@idea.avet.com.pl Constructive comments: deraadt@openbsd.org, sef, peter, jkh
|
#
f85bdfcc |
|
26-Dec-1999 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused includes. Removed unused compatibility cruft for dup(). Using it today would just break dup() on fd's >= 64. Fixed some style bugs.
|
#
151f7a5d |
|
18-Nov-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Only bother converting the stat structure if we intend to return it, when no error occurs. PR: kern/14966 Reviewed by: dillon@freebsd.org Submitted by: Kelly Yancey kbyanc@posi.net
|
#
2c77a71d |
|
17-Nov-1999 |
Peter Wemm <peter@FreeBSD.org> |
Remove cdevsw_add() - the necessary make_dev() calls appear to be there already.
|
#
2e3c8fcb |
|
16-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
This is a partial commit of the patch from PR 14914: A lot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914
|
#
cf87559c |
|
07-Nov-1999 |
Peter Wemm <peter@FreeBSD.org> |
Use fo_stat() rather than duplicating knowledge of file type internals in here for stat(2) and friends. Update the badops entries accordingly.
|
#
d91e41c8 |
|
06-Nov-1999 |
Brian Feldman <green@FreeBSD.org> |
Fix the advisory file locking by restoring previous ordering in closef()/ fdrop(). This only showed up when a file descriptor was duplicated and then closed once, where the lock would be released on the first close().
|
#
d1f088da |
|
11-Oct-1999 |
Peter Wemm <peter@FreeBSD.org> |
Trim unused options (or #ifdef for undoc options). Submitted by: phk
|
#
d6a0e38a |
|
25-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove five now unused fields from struct cdevsw. They should never have been there in the first place. A GENERIC kernel shrinks almost 1k. Add a slightly different safetybelt under nostop for tty drivers. Add some missing FreeBSD tags
|
#
2fe5bd8b |
|
25-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a hole in jail(2). Noticed by: Alexander Bezroutchko <abb@zenon.net>
|
#
13ccadd4 |
|
19-Sep-1999 |
Brian Feldman <green@FreeBSD.org> |
This is what was "fdfix2.patch," a fix for fd sharing. It's pretty far-reaching in fd-land, so you'll want to consult the code for changes. The biggest change is that now, you don't use fp->f_ops->fo_foo(fp, bar) but instead fo_foo(fp, bar), which increments and decrements the fp refcount upon entry and exit. Two new calls, fhold() and fdrop(), are provided. Each does what it seems like it should, and if fdrop() brings the refcount to zero, the fd is freed as well. Thanks to peter ("to hell with it, it looks ok to me.") for his review. Thanks to msmith for keeping me from putting locks everywhere :) Reviewed by: peter
|
#
c3aac50f |
|
27-Aug-1999 |
Peter Wemm <peter@FreeBSD.org> |
$Id$ -> $FreeBSD$
|
#
9dcbe240 |
|
23-Aug-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Convert DEVFS hooks in (most) drivers to make_dev(). Diskslice/label code not yet handled. Vinum, i4b, alpha, pc98 not dealt with (left to respective Maintainers) Add the correct hook for devfs to kern_conf.c The net result of this exercise is that a lot fewer files depend on DEVFS, and devtoname() gets more sensible output in many cases. A few drivers had minor additional cleanups performed relating to cdevsw registration. A few drivers don't register a cdevsw{} anymore, but only use make_dev().
|
#
e32c66c5 |
|
04-Aug-1999 |
Brian Feldman <green@FreeBSD.org> |
Fix fd race conditions (during shared fd table usage.) Badfileops is now used in f_ops in place of NULL, and modifications to the files are more carefully ordered. f_ops should also be set to &badfileops upon "close" of a file. This does not fix other problems mentioned in this PR than the first one. PR: 11629 Reviewed by: peter
|
#
79fc0bf4 |
|
07-Jun-1999 |
Mike Smith <msmith@FreeBSD.org> |
From the submitter: - this causes POSIX locking to use the thread group leader (p->p_leader) as the locking thread for all advisory locks. In non-kernel-threaded code p->p_leader == p, so this will have no effect. This results in (more) correct POSIX threaded flock-ing semantics. It also prevents the leader from exiting before any of the children. (so that p->p_leader will never be stale) in exit1(). We have been running this patch for over a month now in our lab under load and at customer sites. Submitted by: John Plevyak <jplevyak@inktomi.com>
|
#
2447bec8 |
|
31-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Simplify cdevsw registration. The cdevsw_add() function now finds the major number(s) in the struct cdevsw passed to it. cdevsw_add_generic() is no longer needed, cdevsw_add() does the same thing. cdevsw_add() will print a message if the d_maj field looks bogus. Remove nblkdev and nchrdev variables. Most places they were used bogusly. Instead check a dev_t for validity by seeing if devsw() or bdevsw() returns NULL. Move bdevsw() and devsw() functions to kern/kern_conf.c Bump __FreeBSD_version to 400006 This commit removes: 72 bogus makedev() calls 26 bogus SYSINIT functions if_xe.c bogusly accessed cdevsw[], author/maintainer please fix. I4b and vinum not changed. Patches emailed to authors. LINT probably broken until they catch up.
|
#
4e2f199e |
|
30-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
This commit should be an extensive NO-OP: Reformat and initialize correctly all "struct cdevsw". Initialize the d_maj and d_bmaj fields. The d_reset field was not removed, although it is never used. I used a program to do most of this, so all the files now use the same consistent format. Please keep it that way. Vinum and i4b not modified, patches emailed to respective authors.
|
#
bfbb9ce6 |
|
11-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland. Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev() For now they're functions, they will become in-line functions after one of the next two steps in this process. Return major/minor/makedev to macro-hood for userland. Register a name in cdevsw[] for the "filedescriptor" driver. In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to an actual device. In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few housekeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang). A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that. Without DEVT_FASCIST I believe this patch is a no-op. Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result. Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).
|
#
3d177f46 |
|
03-May-1999 |
Bill Fumerola <billf@FreeBSD.org> |
Add sysctl descriptions to many SYSCTL_XXXs PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)
|
#
604359cf |
|
28-Apr-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
s/static foo_devsw_installed = 0;/static int foo_devsw_installed;/. (Edited automatically)
|
#
5526d2d9 |
|
08-Jan-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith
|
#
62d6ce3a |
|
11-Nov-1998 |
Don Lewis <truckman@FreeBSD.org> |
I got another batch of suggestions for cosmetic changes from bde.
|
#
831d27a9 |
|
11-Nov-1998 |
Don Lewis <truckman@FreeBSD.org> |
Installed the second patch attached to kern/7899 with some changes suggested by bde, a few other tweaks to get the patch to apply cleanly again and some improvements to the comments. This change closes some fairly minor security holes associated with F_SETOWN, fixes a few bugs, and removes some limitations that F_SETOWN had on tty devices. For more details, see the description on the PR. Because this patch increases the size of the proc and pgrp structures, it is necessary to re-install the includes and recompile libkvm, the vinum lkm, fstat, gcore, gdb, ipfilter, ps, top, and w. PR: kern/7899 Reviewed by: bde, elvind
|
#
d974cf4d |
|
29-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed printf format errors.
|
#
1ede4662 |
|
15-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Cast longs to intptr_t before casting them to pointers. Fixed bitrot in pseudo-declaration of `struct fcntl_args'. fcntl() is now broken in some cases when ints are larger than longs.
|
#
2ef49ddf |
|
10-Jun-1998 |
Doug Rabson <dfr@FreeBSD.org> |
64bit fixes: p->p_retval is a register_t[] not an int[].
|
#
1f562172 |
|
10-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Fix the futimes/undelete/utrace conflict with other BSD's. Note that the only common usage of utrace (the possible problem with this commit) is with malloc, so this should be a real problem. Add the various NetBSD syscalls that allow full emulation of their development environment.
|
#
9f24f214 |
|
14-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
Make the rootdir handling more consistent. Now, processes always have a root vnode associated with them, and no special checks for the null case are needed. Submitted by: terry@freebsd.org
|
#
0b08f5f7 |
|
05-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Back out DIAGNOSTIC changes.
|
#
47cfdb16 |
|
04-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Turn DIAGNOSTIC into a new-style option.
|
#
7b778b5e |
|
23-Jan-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related options new-style. This introduces an xxxFS_BOOT for each of the rootable filesystems. (Presently not required, but encouraged to allow a smooth move of option *FS to opt_dontuse.h later.) LFS is temporarily disabled, and will be re-enabled tomorrow.
|
#
5591b823d |
|
16-Dec-1997 |
Eivind Eklund <eivind@FreeBSD.org> |
Make COMPAT_43 and COMPAT_SUNOS new-style options.
|
#
fd3bf775 |
|
28-Nov-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix and complete the AIO syscalls. There are some performance enhancements coming up soon, but the code is functional. Docs will be forthcoming.
|
#
a3c78a76 |
|
22-Nov-1997 |
Bruce Evans <bde@FreeBSD.org> |
Fixed a missing conversion of retval to p_retval in disabled code. Fixed overflow of FFLAGS() in fcntl(F_SETFL, ...). This was not a security hole, but gave wrong results for silly flags values. E.g., it made fcntl(F_SETFL, -1) equivalent to fcntl(F_SETFL, 0). POSIX requires ignoring the open mode bits in fcntl() (even if they would be invalid for open()).
|
#
d826c479 |
|
23-Nov-1997 |
Bruce Evans <bde@FreeBSD.org> |
Fixed duplicate definitions of M_FILE (one static).
|
#
cb226aaa |
|
06-Nov-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources. I have not removed the /*ARGSUSED*/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.
|
#
a1c995b6 |
|
12-Oct-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Last major round (Unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde
|
#
55166637 |
|
11-Oct-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Distribute and staticize a lot of the malloc M_* types. Substantial input from: bde
|
#
51338ea8 |
|
13-Sep-1997 |
Peter Wemm <peter@FreeBSD.org> |
Various select -> poll changes
|
#
32545fd1 |
|
25-Aug-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed some stale comments. Fixed a gratuitous ANSIism.
|
#
9dd8309d |
|
09-Apr-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed support for OLD_PIPE. <sys/stat.h> is now missing the hack that supported nameless pipes being indistinguishable from fifos. We're not going back.
|
#
6875d254 |
|
22-Feb-1997 |
Peter Wemm <peter@FreeBSD.org> |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
1130b656 |
|
14-Jan-1997 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
8b612c4b |
|
28-Dec-1996 |
John Dyson <dyson@FreeBSD.org> |
This commit is the embodiment of some VFS read clustering improvements. Firstly, now our read-ahead clustering is on a file descriptor basis and not on a per-vnode basis. This will allow multiple processes reading the same file to take advantage of read-ahead clustering. Secondly, there previously was a problem with large reads still using the ramp-up algorithm. Of course, that was bogus, and now we read the entire "chunk" off of the disk in one operation. The read-ahead clustering algorithm should use less CPU than the previous also (I hope :-)). NOTE: THAT LKMS MUST BE REBUILT!!!
|
#
b8b6f501 |
|
19-Dec-1996 |
Bruce Evans <bde@FreeBSD.org> |
Fixed nonexistent checking of lock types for F_GETLK. Found by: NIST-PCTS
|
#
bb65f5a1 |
|
19-Dec-1996 |
Bruce Evans <bde@FreeBSD.org> |
Fixed lseek() on named pipes. It always succeeded but should always fail. Broke locking on named pipes in the same way as locking on non-vnodes (wrong errno). This will be fixed later. The fix involves negative logic. Named pipes are now distinguished from other types of files with vnodes, and there is additional code to handle vnodes and named pipes in the same way only where that makes sense (not for lseek, locking or TIOCSCTTY).
|
#
efebc4ab |
|
28-Sep-1996 |
Bruce Evans <bde@FreeBSD.org> |
Fixed bitrot in the read-only attribute: - kern.maxfilesperproc was read-only (and thus essentially useless). Removed unused #includes. Strength-reduced used #includes.
|
#
de71b880 |
|
15-Aug-1996 |
Sujal Patel <smpatel@FreeBSD.org> |
Fix fdavail() so that it correctly pays attention to the rlimit. Fixes unp_externalize panic which occurs when a process is at its ulimit for file descriptors and tries to receive a file descriptor from another process. Reviewed by: wollman
|
#
8a095c52 |
|
17-Jun-1996 |
Bill Paul <wpaul@FreeBSD.org> |
Add a couple of #ifdef DEVFS/#endif clauses to silence the following compiler warnings which occur if you don't have 'options DEVFS' in your kernel config file: ../../kern/kern_descrip.c: In function `fildesc_drvinit': ../../kern/kern_descrip.c:1103: warning: unused variable `fd' ../../kern/kern_descrip.c: At top level: ../../kern/kern_descrip.c:1095: warning: `devfs_token_stdin' defined but not used ../../kern/kern_descrip.c:1096: warning: `devfs_token_stdout' defined but not used ../../kern/kern_descrip.c:1097: warning: `devfs_token_stderr' defined but not used ../../kern/kern_descrip.c:1098: warning: `devfs_token_fildesc' defined but not used
|
#
c23670e2 |
|
11-Jun-1996 |
Gary Palmer <gpalmer@FreeBSD.org> |
Clean up -Wunused warnings. Reviewed by: bde
|
#
8fb33324 |
|
27-Mar-1996 |
Bruce Evans <bde@FreeBSD.org> |
Fixed the unit numbers of the devfs `fd' devices. Made the devfs `fd' devices bug for bug compatible with the ones created by MAKEDEV: - ownership is bin.bin, not root.wheel, except for std*. The devfsext interface doesn't seem to allow specifying the ownership of /devfs/fd, so it's still incompatible. - std* aren't links to fd/[0-2].
|
#
edbfedac |
|
11-Mar-1996 |
Peter Wemm <peter@FreeBSD.org> |
Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all files are off the vendor branch, so this should not change anything. A "U" marker generally means that the file was not changed in between the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally means that there was a change. [note new unused (in this form) syscalls.conf, to be 'cvs rm'ed]
|
#
4b50ceef |
|
10-Mar-1996 |
Jeffrey Hsu <hsu@FreeBSD.org> |
Merge in Lite2: LIST replacement for f_filef, f_fileb, and filehead. Did not accept change of second argument to ioctl from int to u_long. Reviewed by: davidg & bde
|
#
dabee6fe |
|
23-Feb-1996 |
Peter Wemm <peter@FreeBSD.org> |
kern_descrip.c: add fdshare()/fdcopy() kern_fork.c: add the tiny bit of code for rfork operation. kern/sysv_*: shmfork() takes one less arg, it was never used. sys/shm.h: drop "isvfork" arg from shmfork() prototype sys/param.h: declare rfork args.. (this is where OpenBSD put it..) sys/filedesc.h: protos for fdshare/fdcopy. vm/vm_mmap.c: add minherit code, add rounding to mmap() type args where it makes sense. vm/*: drop unused isvfork arg. Note: this rfork() implementation copies the address space mappings, it does not connect the mappings together. ie: once the two processes have split, the pages may be shared, but the address space is not. If one does a mmap() etc, it does not appear in the other. This makes it not useful for pthreads, but it is useful in its own right for having light-weight threads in a static shared address space. Obtained from: Original by Ron Minnich, extended by OpenBSD
|
#
2834ceec |
|
04-Feb-1996 |
John Dyson <dyson@FreeBSD.org> |
Improve the performance for pipe(2) again. Also include some fixes for previous version of new pipes from Bruce Evans. This new version: Supports more properly the semantics of select (BDE). Supports "OLD_PIPE" correctly (kern_descrip.c, BDE). Eliminates incorrect EPIPE returns (bash 'pipe broken' messages.) Much faster yet, currently tuned relatively conservatively -- but now gives approx 50% more perf than the new pipes code did originally. (That was about 50% more perf than the original BSD pipe code.) Known bugs outstanding: No support for async io (SIGIO). Will be included soon. Next to do: Merge support for FIFOs. Submitted by: bde
|
#
f9827213 |
|
28-Jan-1996 |
John Dyson <dyson@FreeBSD.org> |
Enable the new fast pipe code. The old pipes can be used with the "OLD_PIPE" config option.
|
#
87b6de2b |
|
14-Dec-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
A Major staticize sweep. Generates a couple of warnings that I'll deal with later. A number of unused vars removed. A number of unused procs removed or #ifdefed.
|
#
d2f265fa |
|
08-Dec-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Julian forgot to make the *devsw structures static.
|
#
87f6c662 |
|
08-Dec-1995 |
Julian Elischer <julian@FreeBSD.org> |
Pass 3 of the great devsw changes most devsw referenced functions are now static, as they are in the same file as their devsw structure. I've also added DEVFS support for nearly every device in the system, however many of the devices have 'incorrect' names under DEVFS because I couldn't quickly work out the correct naming conventions. (but devfs won't be coming on line for a month or so anyhow so that doesn't matter) If you "OWN" a device which would normally have an entry in /dev then search for the devfs_add_devsw() entries and munge to make them right.. check out similar devices to see what I might have done in them if you can't see what's going on.. for a laugh compare conf.c conf.h before and after... :) I have not done DEVFS entries for any DISKSLICE devices yet as that will be a much more complicated job.. (pass 5 :) pass 4 will be to make the devsw tables of type (cdevsw * ) rather than (cdevsw) seems to work here.. complaints to the usual places.. :)
|
#
efeaf95a |
|
06-Dec-1995 |
David Greenman <dg@FreeBSD.org> |
Untangled the vm.h include file spaghetti.
|
#
4cb03b1b |
|
05-Dec-1995 |
Bruce Evans <bde@FreeBSD.org> |
Include <vm/vm.h> or <vm/vm_page.h> explicitly to avoid breaking when vnode_if.h doesn't include vm stuff.
|
#
946bb7a2 |
|
04-Dec-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
A major sweep over the sysctl stuff. Move a lot of variables home to their own code (In good time before xmas :-) Introduce the string description of format. Add a couple more functions to poke into these marvels, while I try to decide what the correct interface should look like. Next is adding vars on the fly, and sysctl looking at them too. Removed a tiny bit of defunct and #ifdefed notused code in swapgeneric.
|
#
98d93822 |
|
02-Dec-1995 |
Bruce Evans <bde@FreeBSD.org> |
Completed function declarations and/or added prototypes.
|
#
7198bf47 |
|
29-Nov-1995 |
Julian Elischer <julian@FreeBSD.org> |
If you're going to mechanically replicate something in 50 files it's best to not have a (compiles cleanly) typo in it! (sigh)
|
#
53ac6efb |
|
29-Nov-1995 |
Julian Elischer <julian@FreeBSD.org> |
OK, that's it.. That's EVERY SINGLE driver that has an entry in conf.c.. my next trick will be to define cdevsw[] and bdevsw[] as empty arrays and remove all those DAMNED defines as well.. Each of these drivers has a SYSINIT linker set entry that comes in very early.. and asks the driver to add its own entry to the two devsw[] tables. some slight reworking of the commits from yesterday (added the SYSINIT stuff and some usually wrong but token DEVFS entries to all these devices). BTW does anyone know where the 'ata' entries in conf.c actually reside? seems we don't actually have a 'ataopen() etc... If you want to add a new device in conf.c please make sure I know so I can keep it up to date too.. as before, this is all dependent on #if defined(JREMOD) (and #ifdef DEVFS in parts)
|
#
18e6fe02 |
|
14-Nov-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add new-style sysctl for KERN_FILE here.
|
#
d2d3e875 |
|
11-Nov-1995 |
Bruce Evans <bde@FreeBSD.org> |
Included <sys/sysproto.h> to get central declarations for syscall args structs and prototypes for syscalls. Ifdefed duplicated decentralized declarations of args structs. It's convenient to have this visible but they are hard to maintain. Some are already different from the central declarations. 4.4lite2 puts them in comments in the function headers but I wanted to avoid the large changes for that.
|
#
079cc25b |
|
21-Oct-1995 |
David Greenman <dg@FreeBSD.org> |
Killed a few gratuitous #include's.
|
#
ad7507e2 |
|
07-Oct-1995 |
Steven Wallace <swallace@FreeBSD.org> |
Remove prototype definitions from <sys/systm.h>. Prototypes are located in <sys/sysproto.h>. Add appropriate #include <sys/sysproto.h> to files that needed protos from systm.h. Add structure definitions to appropriate files that relied on sys/systm.h, right before system call definition, as in the rest of the kernel source. In kern_prot.c, instead of using the dummy structure "args", create individual dummy structures named <syscall>_args. This makes life easier for prototype generation.
|
#
9b2e5354 |
|
30-May-1995 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Remove trailing whitespace.
|
#
3aa12267 |
|
28-Mar-1995 |
Bruce Evans <bde@FreeBSD.org> |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) that I didn't notice when I fixed "all" such warnings before.
|
#
e6373c9e |
|
20-Feb-1995 |
Guido van Rooij <guido@FreeBSD.org> |
Implement maxprocperuid and maxfilesperproc. They are tunable via sysctl(8). The initial value of maxprocperuid is maxproc-1, that of maxfilesperproc is maxfiles (until maxfile will disappear) Now it is at least possible to prohibit one user opening maxfiles -Guido
|
#
20989d2d |
|
11-Dec-1994 |
Bruce Evans <bde@FreeBSD.org> |
Obtained from: my fix for 1.1.5 Remove compatibility hack so that dup(fd) isn't interpreted as dup2(fd & 0x3f, random_junk_on_stack_fd) when (fd & 0x3f) != 0.
|
#
797f2d22 |
|
02-Oct-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
All of this is cosmetic. prototypes, #includes, printfs and so on. Makes GCC a lot more silent.
|
#
bb56ec4a |
|
25-Sep-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
While in the real world, I had a bad case of being swapped out for a lot of cycles. While waiting there I added a lot of the extra ()'s I have, (I have never used LISP to any extent). So I compiled the kernel with -Wall and shut up a lot of "suggest you add ()'s", removed a bunch of unused var's and added a couple of declarations here and there. Having a lap-top is highly recommended. My kernel still runs, yell at me if your kernel breaks.
|
#
b36a2ba1 |
|
02-Sep-1994 |
David Greenman <dg@FreeBSD.org> |
munmapfd() was being called with one too few params - bug introduced during my initial kernel port.
|
#
3c4dd356 |
|
02-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Added $Id$
|
#
26f9a767 |
|
25-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
df8bae1d |
|
24-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
BSD 4.4 Lite Kernel Sources
|