#
e30621d5 |
|
18-May-2024 |
Ricardo Branco <rbranco@suse.de> |
mqueue: Introduce kern_kmq_timedreceive & kern_kmq_timedsend Reviewed by: imp, kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1248
|
#
289b2d6a |
|
17-May-2024 |
Ricardo Branco <rbranco@suse.de> |
mqueue: Export some functions to be used by Linuxulator Reviewed by: imp, kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1248
|
#
ddbfb544 |
|
15-May-2024 |
Ricardo Branco <rbranco@suse.de> |
mqueuefs: Relax restriction that path must begin with a slash This is needed to support Linux implementation which discards the leading slash when calling mq_open(2) Reviewed by: imp, kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1248
|
#
acb7a4de |
|
11-May-2024 |
Ricardo Branco <rbranco@suse.de> |
mqueue: Add sysctl for default_maxmsg & default_msgsize and fix descriptions Reviewed by: imp, kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1248
|
#
f0a4dd6d |
|
21-May-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
mqueuefs: mark newly allocated vnode as constructed, under the lock Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
b6f4a3fa |
|
21-May-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
mqueuefs: uma_zfree() can be postponed until mqfs sx mi_lock is dropped Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
63f18b37 |
|
21-May-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
mqueuefs: minor style pass Also remove not needed inclusion of sys/cdefs.h. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
b307cfe4 |
|
01-Mar-2024 |
Stefan Eßer <se@FreeBSD.org> |
mqueuefs: fix statfs report to not signal file system full Synthetic file systems that do not actually allocate file system blocks or inodes should report that they have space available and that they provide 0 inodes, in order to prevent capacity monitoring tools from warning about resource exhaustion. This has been fixed in all other synthetic file systems in base in commit 88a795e80c0, but this file was overlooked since its name does not indicate that it also provides a file system. MFC after: 1 month
|
#
f28526e9 |
|
19-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
kcmp(2): implement for generic file types Reviewed by: brooks, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43518
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
5b5b7e2c |
|
17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: always retain path buffer after lookup This removes some of the complexity needed to maintain HASBUF and allows for removing injecting SAVENAME by filesystems. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D36542
|
#
f17ef286 |
|
22-Feb-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: rename fget*_locked to fget*_noref This gets rid of the error prone naming where fget_unlocked returns with a ref held, while fget_locked requires a lock but provides nothing in terms of making sure the file lives past unlock. No functional changes.
|
#
b214fcce |
|
13-Dec-2021 |
Alan Somers <asomers@FreeBSD.org> |
Change VOP_READDIR's cookies argument to a **uint64_t The cookies argument is only used by the NFS server. NFSv2 defines the cookie as 32 bits on the wire, but NFSv3 increased it to 64 bits. Our VOP_READDIR, however, has always defined it as u_long, which is 32 bits on some architectures. Change it to 64 bits on all architectures. This doesn't matter for any in-tree file systems, but it matters for some FUSE file systems that use 64-bit directory cookies. PR: 260375 Reviewed by: rmacklem Differential Revision: https://reviews.freebsd.org/D33404
|
#
2b68eb8e |
|
01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove thread argument from VOP_STAT and fo_stat.
|
#
b4a58fbf |
|
01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove cn_thread It is always curthread. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D32453
|
#
f7496dca |
|
21-Feb-2021 |
Jamie Gritton <jamie@FreeBSD.org> |
jail: Change the locking around pr_ref and pr_uref Require both the prison mutex and allprison_lock when pr_ref or pr_uref go to/from zero. Adding a non-first or removing a non-last reference remain lock-free. This means that a shared hold on allprison_lock is sufficient for prison_isalive() to be useful, which removes a number of cases of lock/check/unlock on the prison mutex. Expand the locking in kern_jail_set() to keep allprison_lock held exclusive until the new prison is valid, thus making invalid prisons invisible to any thread holding allprison_lock (except of course the one creating or destroying the prison). This renders prison_isvalid() nearly redundant, now used only in asserts. Differential Revision: https://reviews.freebsd.org/D28419 Differential Revision: https://reviews.freebsd.org/D28458
|
#
76ad42ab |
|
18-Jan-2021 |
Jamie Gritton <jamie@FreeBSD.org> |
jail: Add prison_isvalid() and prison_isalive() prison_isvalid() checks if a prison record can be used at all, i.e. pr_ref > 0. This filters out prisons that aren't fully created, and those that are either in the process of being dismantled, or will be at the next opportunity. While the check for pr_ref > 0 is simple enough to make without a convenience function, this prepares the way for other measures of prison validity. prison_isalive() checks not only validity as far as the useablity of the prison structure, but also whether the prison is visible to user space. It replaces a test for pr_uref > 0, which is currently only used within kern_jail.c, and not often there. Both of these functions also assert that either the prison mutex or allprison_lock is held, since it's generally the case that unlocked prisons aren't guaranteed to remain useable for any length of time. This isn't entirely true, for example a thread can assume its own prison is good, but most exceptions will exist inside of kern_jail.c.
|
#
25c2c952 |
|
17-Jan-2021 |
Jamie Gritton <jamie@FreeBSD.org> |
jail: Add proper prison locking in mqfs_prison_remove.
|
#
90f580b9 |
|
03-Jan-2021 |
Mark Johnston <markj@FreeBSD.org> |
Ensure that dirent's d_off field is initialized We have the d_off field in struct dirent for providing the seek offset of the next directory entry. Several filesystems were not initializing the field, which ends up being copied out to userland. Reported by: Syed Faraz Abrar <faraz@elttam.com> Reviewed by: kib MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27792
|
#
85078b85 |
|
17-Nov-2020 |
Conrad Meyer <cem@FreeBSD.org> |
Split out cwd/root/jail, cmask state from filedesc table No functional change intended. Tracking these structures separately for each proc enables future work to correctly emulate clone(2) in linux(4). __FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof. Reviewed by: kib Discussed with: markj, mjg Differential Revision: https://reviews.freebsd.org/D27037
|
#
6fed89b1 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
kern: clean up empty lines in .c and .h files
|
#
8f226f4c |
|
19-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove the always-curthread td argument from VOP_RECLAIM
|
#
a92a971b |
|
16-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove the thread argument from vget It was already asserted to be curthread. Semantic patch: @@ expression arg1, arg2, arg3; @@ - vget(arg1, arg2, arg3) + vget(arg1, arg2)
|
#
d292b194 |
|
05-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove the obsolete privused argument from vaccess This brings argument count down to 6, which is passable without the stack on amd64.
|
#
7029da5c |
|
26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
|
#
42ec4f05 |
|
27-Jan-2020 |
Warner Losh <imp@FreeBSD.org> |
Make mqueue objects work across a fork again. In r110908 (2003) alfred added DFLAG_PASSABLE to tag those types of FD that can be passed via unix pipes, but mqueuefs didn't exist yet. Later, in r152825 (2005) davidxu neglected to include DFLAG_PASSABLE since people don't normally pass these things via unix sockets (it's a FreeBSD implementation detail that it's a file descriptor, nobody noticed). Then r223866 (2011) by jonathan used the new flag in fdcopy, which fork uses. Due to that, mqueuefs actually broke mqueue objects being propagated by fork. No mention of mqueuefs was made in r223866, so I think it was an unintended consequence. Fix this by tagging mqueuefs as passable as well. They were prior to alfred's change (and it's clear there's no intent in his change to change this behavior), and POSIX requires this to be the case as well. PR: 243103 Reviewed by: kib@, jiles@ Differential Revision: https://reviews.freebsd.org/D23038
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
6fa079fc |
|
15-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: flatten vop vectors This eliminates the following loop from all VOP calls: while(vop != NULL && \ vop->vop_spare2 == NULL && vop->vop_bypass == NULL) vop = vop->vop_default; Reviewed by: jeff Tesetd by: pho Differential Revision: https://reviews.freebsd.org/D22738
|
#
f3719206 |
|
20-Aug-2019 |
Ed Maste <emaste@FreeBSD.org> |
mqueuefs: fix compat32 struct file leak In a compat32 error case we previously leaked a struct file. Submitted by: Karsten König, Secfault Security Security: CVE-2019-5603
|
#
051e692a |
|
23-Jul-2019 |
Ed Maste <emaste@FreeBSD.org> |
mqueuefs: fix struct file leak In some error cases we previously leaked a stuct file. Submitted by: mjg, markj
|
#
45d314c5 |
|
21-May-2019 |
Conrad Meyer <cem@FreeBSD.org> |
mqueuefs: Do not allow manipulation of the pseudo-dirents "." and ".." "." and ".." names are not maintained in the mqueuefs dirent datastructure and cannot be opened as mqueues. Creating or removing them is invalid; return EINVAL instead of crashing. PR: 236836 Submitted by: Torbjørn Birch Moltu <t.b.moltu AT lyse.net> Discussed with: jilles (earlier version)
|
#
cc426dd3 |
|
11-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove unused argument to priv_check_cred. Patch mostly generated with cocinnelle: @@ expression E1,E2; @@ - priv_check_cred(E1,E2,0) + priv_check_cred(E1,E2) Sponsored by: The FreeBSD Foundation
|
#
6d2e2df7 |
|
23-Nov-2018 |
Mark Johnston <markj@FreeBSD.org> |
Ensure that directory entry padding bytes are zeroed. Directory entries must be padded to maintain alignment; in many filesystems the padding was not initialized, resulting in stack memory being copied out to userspace. With the ino64 work there are also some explicit pad fields in struct dirent. Add a subroutine to clear these bytes and use it in the in-tree filesystems. The NFS client is omitted for now as it was fixed separately in r340787. Reported by: Thomas Barabosch, Fraunhofer FKIE Reviewed by: kib MFC after: 3 days Sponsored by: The FreeBSD Foundation
|
#
f71ef9b6 |
|
06-Nov-2018 |
Mark Johnston <markj@FreeBSD.org> |
Use plain atomic_{add,subtract} when that's sufficient. CID: 1386920 MFC after: 2 weeks
|
#
6040822c |
|
30-Jul-2018 |
Alan Somers <asomers@FreeBSD.org> |
Make timespecadd(3) and friends public The timespecadd(3) family of macros were imported from NetBSD back in r35029. However, they were initially guarded by #ifdef _KERNEL. In the meantime, we have grown at least 28 syscalls that use timespecs in some way, leading many programs both inside and outside of the base system to redefine those macros. It's better just to make the definitions public. Our kernel currently defines two-argument versions of timespecadd and timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define three-argument versions. Solaris also defines a three-argument version, but only in its kernel. This revision changes our definition to match the common three-argument version. Bump _FreeBSD_version due to the breaking KPI change. Discussed with: cem, jilles, ian, bde Differential Revision: https://reviews.freebsd.org/D14725
|
#
4949ad72 |
|
18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
mqueue: avoid unused variables
|
#
cbd92ce6 |
|
09-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Eliminate the overhead of gratuitous repeated reinitialization of cap_rights - Add macros to allow preinitialization of cap_rights_t. - Convert most commonly used code paths to use preinitialized cap_rights_t. A 3.6% speedup in fstat was measured with this change. Reported by: mjg Reviewed by: oshogbo Approved by: sbruno MFC after: 1 month
|
#
6469bdcd |
|
06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compabibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensure opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941
|
#
8a36da99 |
|
27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/kern: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.
|
#
5cead591 |
|
14-Jul-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Correct sysent flags for dynamically loaded syscalls. Using the https://github.com/google/capsicum-test/ suite, the PosixMqueue.CapModeForked test was failing due to an ECAPMODE after calling kmq_notify(). On further inspection, the dynamically loaded syscall entry was initialized with sy_flags zeroed out, since SYSCALL_INIT_HELPER() left sysent.sy_flags with the default value. Add a new helper SYSCALL{,32}_INIT_HELPER_F() which takes an additional argument to specify the sy_flags value. Submitted by: Siva Mahadevan <smahadevan@freebsdfoundation.org> Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D11576
|
#
15bcf785 |
|
31-Mar-2017 |
Robert Watson <rwatson@FreeBSD.org> |
Audit arguments to POSIX message queues, semaphores, and shared memory. This requires minor changes to the audit framework to allow capturing paths that are not filesystem paths (i.e., will not be canonicalised relative to the process current working directory and/or filesystem root). Obtained from: TrustedBSD Project MFC after: 3 weeks Sponsored by: DARPA, AFRL
|
#
8e31b510 |
|
17-Feb-2017 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Fix panic with unlocked vnode to vrecycle(). MFC after: 2 weeks
|
#
ae44bb01 |
|
14-Nov-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Initialize reserved bytes in struct mq_attr and its 32compat counterpart, to avoid kernel stack content leak in kmq_setattr(2) syscall. Also slightly simplify the checks around copyout()s. Reported by: Vlad Tsyrklevich <vlad902+spam@gmail.com> PR: 214488 MFC after: 1 week
|
#
add14c83 |
|
24-Apr-2016 |
Jamie Gritton <jamie@FreeBSD.org> |
Use the new PR_METHOD_REMOVE to clean up jail handling in POSIX message queues.
|
#
44c16975 |
|
14-Apr-2016 |
Jamie Gritton <jamie@FreeBSD.org> |
Clean up some style(9) violations.
|
#
adb023ae |
|
13-Apr-2016 |
Jamie Gritton <jamie@FreeBSD.org> |
Separate POSIX mqueue objects in jails; actually, separate them by the jail's root, so jails that don't have their own filesystem directory also won't have their own mqueue namespace. PR: 208082
|
#
90f54cbf |
|
11-Apr-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: remove filedesc argument from fdclose Just accept a thread instead. This makes it consistent with fdalloc. No functional changes.
|
#
e015b1ab |
|
26-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Avoid dynamic syscall overhead for statically compiled modules. The kernel tracks syscall users so that modules can safely unregister them. But if the module is not unloadable or was compiled into the kernel, there is no need to do this. Achieve this by adding SY_THR_STATIC_KLD macro which expands to SY_THR_STATIC during kernel build and 0 otherwise. Reviewed by: kib (previous version) MFC after: 2 weeks
|
#
9696feeb |
|
22-Sep-2014 |
John Baldwin <jhb@FreeBSD.org> |
Add a new fo_fill_kinfo fileops method to add type-specific information to struct kinfo_file. - Move the various fill_*_info() methods out of kern_descrip.c and into the various file type implementations. - Rework the support for kinfo_ofile to generate a suitable kinfo_file object for each file and then convert that to a kinfo_ofile structure rather than keeping a second, different set of code that directly manipulates type-specific file information. - Remove the shm_path() and ksem_info() layering violations. Differential Revision: https://reviews.freebsd.org/D775 Reviewed by: kib, glebius (earlier version)
|
#
2d69d0dc |
|
12-Sep-2014 |
John Baldwin <jhb@FreeBSD.org> |
Fix various issues with invalid file operations: - Add invfo_rdwr() (for read and write), invfo_ioctl(), invfo_poll(), and invfo_kqfilter() for use by file types that do not support the respective operations. Home-grown versions of invfo_poll() were universally broken (they returned an errno value, invfo_poll() uses poll_no_poll() to return an appropriate event mask). Home-grown ioctl routines also tended to return an incorrect errno (invfo_ioctl returns ENOTTY). - Use the invfo_*() functions instead of local versions for unsupported file operations. - Reorder fileops members to match the order in the structure definition to make it easier to spot missing members. - Add several missing methods to linuxfileops used by the OFED shim layer: fo_write(), fo_truncate(), fo_kqfilter(), and fo_stat(). Most of these used invfo_*(), but a dummy fo_stat() implementation was added.
|
#
4a144410 |
|
16-Mar-2014 |
Robert Watson <rwatson@FreeBSD.org> |
Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks
|
#
ed5848c8 |
|
15-Nov-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Replace CAP_POLL_EVENT and CAP_POST_EVENT capability rights (which I had a very hard time to fully understand) with much more intuitive rights: CAP_EVENT - when set on descriptor, the descriptor can be monitored with syscalls like select(2), poll(2), kevent(2). CAP_KQUEUE_EVENT - When set on a kqueue descriptor, the kevent(2) syscall can be called on this kqueue to with the eventlist argument set to non-NULL value; in other words the given kqueue descriptor can be used to monitor other descriptors. CAP_KQUEUE_CHANGE - When set on a kqueue descriptor, the kevent(2) syscall can be called on this kqueue to with the changelist argument set to non-NULL value; in other words it allows to modify events monitored with the given kqueue descriptor. Add alias CAP_KQUEUE, which allows for both CAP_KQUEUE_EVENT and CAP_KQUEUE_CHANGE. Add backward compatibility define CAP_POLL_EVENT which is equal to CAP_EVENT. Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
#
2af0c790 |
|
05-Sep-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix !CAPABILITIES build.
|
#
7008be5b |
|
04-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation
|
#
6fbdb9f4 |
|
18-Aug-2013 |
Jilles Tjoelker <jilles@FreeBSD.org> |
Disallow opening a POSIX message queue for execute. O_EXEC was formerly ignored, so equivalent to O_RDONLY. Reject O_EXEC with [EINVAL] like the invalid mode 3.
|
#
ca04d21d |
|
15-Aug-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Make sendfile() a method in the struct fileops. Currently only vnode backed file descriptors have this method implemented. Reviewed by: kib Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
#
058262c6 |
|
21-Jul-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Wrap kmq_notify(2) for compat32 to properly consume struct sigevent32 argument. Reviewed and tested by: Petr Salinger <Petr.Salinger@seznam.cz> Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
b68cf25f |
|
07-Apr-2013 |
Jilles Tjoelker <jilles@FreeBSD.org> |
mqueue,ksem,shm: Fix race condition with setting UF_EXCLOSE. POSIX mqueue, compatibility ksem and POSIX shm create a file descriptor that has close-on-exec set. However, they do this incorrectly, leaving a window where a thread may fork and exec while the flag has not been set yet. The race is easily reproduced on a multicore system with one thread doing shm_open and close and another thread doing posix_spawnp and waitpid. Set UF_EXCLOSE via falloc()'s flags argument instead. This also simplifies the code. MFC after: 1 week
|
#
2609222a |
|
01-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge Capsicum overhaul: - Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights. - The cap_new(2) system call is left, but it is no longer documented and should not be used in new code. - The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one. - The cap_getrights(2) syscall is renamed to cap_rights_get(2). - If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrived with cap_ioctls_get(2) syscall. - If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrive them with cap_fcntls_get(2). - To support ioctl and fcntl white-listing the filedesc structure was heavly modified. - The audit subsystem, kdump and procstat tools were updated to recognize new syscalls. - Capability rights were revised and eventhough I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below: CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT. Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2). Added CAP_SYMLINKAT: - Allow for symlinkat(2). Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2). Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory. Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall. Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. - Required if target of renameat(2) exists and will be removed by this call. Removed CAP_MAPEXEC. CAP_MMAP old behaviour: - Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE. CAP_MMAP new behaviour: - Allow for mmap(2)+PROT_NONE. Added CAP_MMAP_R: - Allow for mmap(PROT_READ). Added CAP_MMAP_W: - Allow for mmap(PROT_WRITE). Added CAP_MMAP_X: - Allow for mmap(PROT_EXEC). Added CAP_MMAP_RW: - Allow for mmap(PROT_READ | PROT_WRITE). Added CAP_MMAP_RX: - Allow for mmap(PROT_READ | PROT_EXEC). Added CAP_MMAP_WX: - Allow for mmap(PROT_WRITE | PROT_EXEC). Added CAP_MMAP_RWX: - Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC). Renamed CAP_MKDIR to CAP_MKDIRAT. Renamed CAP_MKFIFO to CAP_MKFIFOAT. Renamed CAP_MKNODE to CAP_MKNODEAT. CAP_READ old behaviour: - Allow pread(2). - Disallow read(2), readv(2) (if there is no CAP_SEEK). CAP_READ new behaviour: - Allow read(2), readv(2). - Disallow pread(2) (CAP_SEEK was also required). CAP_WRITE old behaviour: - Allow pwrite(2). - Disallow write(2), writev(2) (if there is no CAP_SEEK). CAP_WRITE new behaviour: - Allow write(2), writev(2). - Disallow pwrite(2) (CAP_SEEK was also required). Added convinient defines: #define CAP_PREAD (CAP_SEEK | CAP_READ) #define CAP_PWRITE (CAP_SEEK | CAP_WRITE) #define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ) #define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE) #define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL) #define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W) #define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X) #define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X) #define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X) #define CAP_RECV CAP_READ #define CAP_SEND CAP_WRITE #define CAP_SOCK_CLIENT \ (CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \ CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN) #define CAP_SOCK_SERVER \ (CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \ CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \ CAP_SETSOCKOPT | CAP_SHUTDOWN) Added defines for backward API compatibility: #define CAP_MAPEXEC CAP_MMAP_X #define CAP_DELETE CAP_UNLINKAT #define CAP_MKDIR CAP_MKDIRAT #define CAP_RMDIR CAP_UNLINKAT #define CAP_MKFIFO CAP_MKFIFOAT #define CAP_MKNOD CAP_MKNODAT #define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER) Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib
|
#
bc2258da |
|
09-Nov-2012 |
Attilio Rao <attilio@FreeBSD.org> |
Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag. Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened at the same timeframe.
|
#
28f865b0 |
|
25-Sep-2012 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Fix freebsd32_kmq_timedreceive() and freebsd32_kmq_timedsend() to use getmq_read() and getmq_write() respectively, just like sys_kmq_timedreceive() and sys_kmq_timedsend(). Sponsored by: FreeBSD Foundation MFC after: 2 weeks
|
#
af6e6b87 |
|
23-Apr-2012 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Remove unused thread argument to vrecycle(). Reviewed by: kib
|
#
dc15eac0 |
|
01-Jan-2012 |
Ed Schouten <ed@FreeBSD.org> |
Use strchr() and strrchr(). It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it. For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.
|
#
6472ac3d |
|
07-Nov-2011 |
Ed Schouten <ed@FreeBSD.org> |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
#
8451d0dd |
|
16-Sep-2011 |
Kip Macy <kmacy@FreeBSD.org> |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
|
#
6aba400a |
|
25-Aug-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Fix a deficiency in the selinfo interface: If a selinfo object is recorded (via selrecord()) and then it is quickly destroyed, with the waiters missing the opportunity to awake, at the next iteration they will find the selinfo object destroyed, causing a PF#. That happens because the selinfo interface has no way to drain the waiters before to destroy the registered selinfo object. Also this race is quite rare to get in practice, because it would require a selrecord(), a poll request by another thread and a quick destruction of the selrecord()'ed selinfo object. Fix this by adding the seldrain() routine which should be called before to destroy the selinfo objects (in order to avoid such case), and fix the present cases where it might have already been called. Sometimes, the context is safe enough to prevent this type of race, like it happens in device drivers which installs selinfo objects on poll callbacks. There, the destruction of the selinfo object happens at driver detach time, when all the filedescriptors should be already closed, thus there cannot be a race. For this case, mfi(4) device driver can be set as an example, as it implements a full correct logic for preventing this from happening. Sponsored by: Sandvine Incorporated Reported by: rstone Tested by: pluknet Reviewed by: jhb, kib Approved by: re (bz) MFC after: 3 weeks
|
#
9c00bb91 |
|
16-Aug-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add the fo_chown and fo_chmod methods to struct fileops and use them to implement fchown(2) and fchmod(2) support for several file types that previously lacked it. Add MAC entries for chown/chmod done on posix shared memory and (old) in-kernel posix semaphores. Based on the submission by: glebius Reviewed by: rwatson Approved by: re (bz)
|
#
d1b6899e |
|
12-Aug-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Rename CAP_*_KEVENT to CAP_*_EVENT. Change the names of a couple of capability rights to be less FreeBSD-specific. Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc
|
#
a9d2f8d8 |
|
10-Aug-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0: Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op. Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions. In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit. Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent. Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
#
1fe80828 |
|
01-Apr-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
After the r219999 is merged to stable/8, rename fallocf(9) to falloc(9) and remove the falloc() version that lacks flag argument. This is done to reduce the KPI bloat. Requested by: jhb X-MFC-note: do not
|
#
de5b1952 |
|
25-Feb-2011 |
Alexander Leidinger <netchild@FreeBSD.org> |
Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/ PMC/SYSV/...). No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availablility if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: arch@ (parts by rwatson, trasz, jhb) X-MFC after: to be determined in last commit with code from this project
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
cf7d9a8c |
|
08-Oct-2010 |
David Xu <davidxu@FreeBSD.org> |
Create a global thread hash table to speed up thread lookup, use rwlock to protect the table. In old code, thread lookup is done with process lock held, to find a thread, kernel has to iterate through process and thread list, this is quite inefficient. With this change, test shows in extreme case performance is dramatically improved. Earlier patch was reviewed by: jhb, julian
|
#
60ae52f7 |
|
21-Jun-2010 |
Ed Schouten <ed@FreeBSD.org> |
Use ISO C99 integer types in sys/kern where possible. There are only about 100 occurences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
|
#
048575ee |
|
07-Apr-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r205325: Implement compat32 shims for mqueuefs.
|
#
510ea843 |
|
28-Mar-2010 |
Ed Schouten <ed@FreeBSD.org> |
Rename st_*timespec fields to st_*tim for POSIX 2008 compliance. A nice thing about POSIX 2008 is that it finally standardizes a way to obtain file access/modification/change times in sub-second precision, namely using struct timespec, which we already have for a very long time. Unfortunately POSIX uses different names. This commit adds compatibility macros, so existing code should still build properly. Also change all source code in the kernel to work without any of the compatibility macros. This makes it all a less ambiguous. I am also renaming st_birthtime to st_birthtim, even though it was a local extension anyway. It seems Cygwin also has a st_birthtim.
|
#
afde2b65 |
|
19-Mar-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement compat32 shims for mqueuefs. Reviewed by: jhb MFC after: 2 weeks
|
#
e76d823b |
|
12-Sep-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Use C99 initialization for struct filterops. Obtained from: Mac OS X Sponsored by: Apple Inc. MFC after: 3 weeks
|
#
d8b0556c |
|
10-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock. Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock. Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly. Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization. Submitted by: jhb [1] Noted by: jhb [2] Reviewed by: jhb Tested by: rnoland
|
#
d6da6408 |
|
10-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix r193923 by noting that type of a_fp is struct file *, not int. It was assumed that r193923 was trivial change that cannot be done wrong. MFC after: 2 weeks
|
#
e4d9bdc1 |
|
10-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
s/a_fdidx/a_fp/ for VOP_OPEN comments that inline struct vop_open_args definition. Discussed with: bde MFC after: 2 weeks
|
#
dfd233ed |
|
11-May-2009 |
Attilio Rao <attilio@FreeBSD.org> |
Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.
|
#
1cbae705 |
|
28-Nov-2008 |
Ed Schouten <ed@FreeBSD.org> |
Fix matching of message queues by name. The mqfs_search() routine uses strncmp() to match message queue objects by name. This is because it can be called from environments where the file name is not null terminated (the VFS for example). Unfortunately it doesn't compare the lengths of the message queue names, which means if a system has "Queue12345", the name "Queue" will also match. I noticed this when a student of mine handed in an exercise using message queues with names "Queue2" and "Queue". Reviewed by: rink
|
#
15bc6b2b |
|
28-Oct-2008 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)
|
#
1ede983c |
|
23-Oct-2008 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
caf8aec8 |
|
20-Sep-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(), vattr_null() or by zeroing it. Remove these to allow preinitialization of fields work in vn_stat(). This is needed to get birthtime initialized correctly. Submitted by: Jaakko Heinonen <jh saunalahti fi> Discussed on: freebsd-fs MFC after: 1 month
|
#
4c5a20e3 |
|
20-Sep-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR(). NODEV is more appropriate when va_rdev doesn't have a meaningful value. Submitted by: Jaakko Heinonen <jh saunalahti fi> Suggested by: bde Discussed on: freebsd-fs MFC after: 1 month
|
#
86dacdfe |
|
20-Sep-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Initialize va_flags and va_filerev properly in VOP_GETATTR(). Don't initialize va_vaflags and va_spare because they are not part of the VOP_GETATTR() API. Also don't initialize birthtime to ctime or zero. Submitted by: Jaakko Heinonen <jh saunalahti fi> Reviewed by: bde Discussed on: freebsd-fs MFC after: 1 month
|
#
fbc48e97 |
|
05-Sep-2008 |
David Xu <davidxu@FreeBSD.org> |
Fix LOR between vnode lock and internal mqueue locks.
|
#
b042e976 |
|
04-Sep-2008 |
David Xu <davidxu@FreeBSD.org> |
Fix lock name conflict. PR: kern/127040
|
#
0359a12e |
|
28-Aug-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
069c6953 |
|
29-Mar-2008 |
Jeff Roberson <jeff@FreeBSD.org> |
- Use vget() to lock the vnode rather than refing without a lock and locking in separate steps.
|
#
22db15c0 |
|
13-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
cb05b60a |
|
09-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
e4650294 |
|
07-Jan-2008 |
John Baldwin <jhb@FreeBSD.org> |
Make ftruncate a 'struct file' operation rather than a vnode operation. This makes it possible to support ftruncate() on non-vnode file types in the future. - 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on a given file descriptor. - ftruncate() moves to kern/sys_generic.c and now just fetches a file object and invokes fo_truncate(). - The vnode-specific portions of ftruncate() move to vn_truncate() in vfs_vnops.c which implements fo_truncate() for vnode file types. - Non-vnode file types return EINVAL in their fo_truncate() method. Submitted by: rwatson
|
#
397c19d1 |
|
29-Dec-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove explicit locking of struct file. - Introduce a finit() which is used to initailize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho
|
#
32f9753c |
|
11-Jun-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project
|
#
302e130e |
|
23-May-2007 |
Olivier Houchard <cognet@FreeBSD.org> |
Remove duplicate includes. Submitted by: Cyril Nguyen Huu <cyril ci0 org>
|
#
94b94b2b |
|
11-Apr-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Remove now-obsolete comment regarding mqueue privileges in jail.
|
#
9956b3f5 |
|
10-Apr-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Do allow POSIX mqueue unlink privilege inside a jail, as we all all other POSIX mqueue privileges inside a jail.
|
#
5e3f7694 |
|
04-Apr-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff
|
#
61b9d89f |
|
12-Mar-2007 |
Tor Egge <tegge@FreeBSD.org> |
Make insmntque() externally visibile and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnode belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib
|
#
873fbcd7 |
|
05-Mar-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Further system call comment cleanup: - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde) - Remove extra blank lines in some cases. - Add extra blank lines in some cases. - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name. - Add punctuation. - Re-wrap some comments.
|
#
6aeb05d7 |
|
11-Nov-2006 |
Tom Rhodes <trhodes@FreeBSD.org> |
Merge posix4/* into normal kernel hierarchy. Reviewed by: glanced at by jhb Approved by: silence on -arch@ and -standards@
|
#
acd3428b |
|
06-Nov-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
#
5da56ddb |
|
25-Sep-2006 |
Tor Egge <tegge@FreeBSD.org> |
Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().
|
#
0f180a7c |
|
17-Apr-2006 |
John Baldwin <jhb@FreeBSD.org> |
Change msleep() and tsleep() to not alter the calling thread's priority if the specified priority is zero. This avoids a race where the calling thread could read a snapshot of it's current priority, then a different thread could change the first thread's priority, then the original thread would call sched_prio() inside msleep() undoing the change made by the second thread. I used a priority of zero as no thread that calls msleep() or tsleep() should be specifying a priority of zero anyway. The various places that passed 'curthread->td_priority' or some variant as the priority now pass 0.
|
#
61d3a4ef |
|
28-Feb-2006 |
David Xu <davidxu@FreeBSD.org> |
Let kernel POSIX timer code and mqueue code to use integer as a resource handle, the timer_t and mqd_t types will be a pointer which userland will define it.
|
#
ba0360b1 |
|
21-Feb-2006 |
David Xu <davidxu@FreeBSD.org> |
Abstract function mqfs_create_node() to create a mqueue node.
|
#
03f70aec |
|
16-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
Replace selwakeuppri with selwakeup, let scheduler figure out appropriate thread priority.
|
#
dd1a6f53 |
|
11-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
Stop fiddling thread priority with msleep, eliminating unnecessary context switching. This improves performance about 30% on UP machine.
|
#
102178d0 |
|
08-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
Comment out mqfs_create_link. Inline some small functions.
|
#
9da8a32a |
|
05-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
o Turn on MPSAFE flag for mqueuefs. o Reuse si_mqd field in siginfo_t, this also gives userland information about which descriptor is notified.
|
#
052ea11c |
|
04-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
After reading some documents, I realized SIGEV_NONE != NULL, also fix code in mqueue_send_notification to handle SIGEV_NONE.
|
#
9947b459 |
|
04-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
Handle SIGEV_NONE, if notification is SIGEV_NONE, error status and return status will be set, but no notification will be registered. Increase hard limit of maxmsg to 100, so posixtestsuite ports can run.
|
#
5ee2d4ac |
|
02-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
1. Cleanup including. 2. Set configuration value for CTL_P1003_1B_MESSAGE_PASSING.
|
#
a6de716d |
|
02-Dec-2005 |
David Xu <davidxu@FreeBSD.org> |
1. Check if message priority is less than MQ_PRIO_MAX. 2. Use getnanotime instead of getnanouptime. 3. Don't free message in _mqueue_send, mqueue_send will free it.
|
#
b2f92ef9 |
|
29-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Last step to make mq_notify conform to POSIX standard, If the process has successfully attached a notification request to the message queue via a queue descriptor, file closing should remove the attachment.
|
#
f72b11a4 |
|
27-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Fix a stupid compiler warining, remove a redundant line.
|
#
47bf2cf9 |
|
27-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Change filesystem name from mqueue to mqueuefs for style consistent. Suggested by: rwatson
|
#
655291f2 |
|
25-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Bring in experimental kernel support for POSIX message queue.
|