#
6b7e4254 |
|
21-May-2024 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
capsicum: allow rfork(2) in capability mode Reviewed by: brooks, rwatson MFC after: 4 days Differential Revision: https://reviews.freebsd.org/D45040
|
#
050555e1 |
|
13-May-2024 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
syscalls.master: allow vfork(2) in capsicum(4) capability mode There is no reason not to do this, we already allow fork(2), and I need vfork(2) for CHERI process colocation. Reviewed by: brooks, emaste, oshogbo MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D39829
|
#
78101d43 |
|
24-Apr-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: correct return type of {read,write}v This was missed when read/write, etc were updated to return ssize_t. Fixes: 2e83b2816183 Fix a few syscall arguments to use size_t instead of u_int. Reviewed by: imp, kib Differential Revision: https://reviews.freebsd.org/D44930
|
#
27676ae3 |
|
19-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: use __acl_type_t Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D44418
|
#
d0efabdf |
|
19-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: make __sys_fcntl take an intptr_t The (optional) third argument of fcntl is sometimes a pointer so change the type to intptr_t. Update the libc-internal definition (actually used by libthr) to take a fixed intptr_t argument rather than pretending it's a variadic function. (That worked because all supported architectures pass variadic arguments as though the function was declared with those types. In CheriBSD that changes because variadic arguments are passed via a bounded array.) Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D44381
|
#
cab73e53 |
|
18-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: struct siginfo -> struct __siginfo struct siginfo doesn't exist, it's struct __siginfo (and siginfo_t). Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D44380
|
#
7936d4e4 |
|
19-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: align with sigfastblock declaration sigfastblock is declared to take a void * argument in the manpage and headers, so declare it that way and use SAL annotations to say it interacts with a 32-bit word. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D44379
|
#
d8d4ed26 |
|
19-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscall.master: fix aio_suspend signature It takes a `const struct aiocb * const *`. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D44378
|
#
128443a9 |
|
19-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: fix readv and writev iovp decl Both take const struct iovec * and only read the values. Reviewed by: olce, kib Differential Revision: https://reviews.freebsd.org/D44377
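As a rough userland illustration of the point above (the vector of buffer descriptors is input only, and the calls report a byte count), here is a minimal sketch using Python's `os.writev` and `os.readv` wrappers over a pipe. The helper name `scatter_gather_roundtrip` is hypothetical and is not part of the commit.

```python
import os


def scatter_gather_roundtrip(chunks):
    """Write `chunks` with one writev(2) call and read them back with readv(2)."""
    rfd, wfd = os.pipe()
    try:
        # writev only reads the buffer descriptors; it returns the number of
        # bytes written (an ssize_t in the C prototype).
        written = os.writev(wfd, chunks)
        # readv fills the caller-supplied buffers; the list describing them
        # is not modified by the call.
        buffers = [bytearray(len(c)) for c in chunks]
        nread = os.readv(rfd, buffers)
        return written, nread, [bytes(b) for b in buffers]
    finally:
        os.close(rfd)
        os.close(wfd)
```

The data buffers passed to readv are of course written to; only the descriptor array itself is treated as read-only, which is what the const qualifier expresses.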
|
#
d6d4183c |
|
01-Feb-2024 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: Remove stray blank lines No functional change.
|
#
d8decc9a |
|
19-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
Add kcmp(2) kernel bits This is based purely on reading the Linux kcmp(2) man page. In addition to the Linux set of comparators, I also added KCMP_FILEOBJ to compare underlying file objects. Tested by: manu Reviewed by: brooks, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43518
|
#
7893419d |
|
04-Dec-2023 |
Brooks Davis <brooks@FreeBSD.org> |
Remove never implemented sbrk and sstk syscalls Both system calls were stubs returning EOPNOTSUPP and libc did not provide _ or __sys_ prefixed symbols. The actual implementation of sbrk(2) is on top of the undocumented break(2) system call. Technically this is a change in ABI, but no non-contrived program ever called these syscalls. Reviewed by: kib, emaste Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D42872
|
#
5b31cc94 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sccs: Manual changes For the uncommon items: Go through the tree and remove sccs tags that didn't fit any nice pattern. If in the neighborhood, other SCM tags were removed when they were detritus of long-ago CVS somehow in the early mists of the project. Some adjacent copyright strings were removed (they duplicated the copyright notices in the file). This also removed non-standard formations of omission of SCCS tags (usually by adding an extra #if 0 somewhere). After this commit, a number of strings tagged with the 'what' @(#) prefix remain, but they are primarily copyright notices. Sponsored by: Netflix
|
#
84d12f88 |
|
06-Oct-2023 |
Kristof Provost <kp@FreeBSD.org> |
Add a COMPAT_FREEBSD14 kernel option Use it wherever COMPAT_FREEBSD13 is currently specified. Reviewed by: brooks, zlei Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D42100
|
#
5e29272b |
|
24-Sep-2023 |
Haoyu Gu <guhaoyu2005@gmail.com> |
syscalls.master: Fix SAL annotation for getdirentries basep argument The last argument of getdirentries, "off_t *basep", is an optional output argument. It returns the value only when the passed-in value (pointer) is non-NULL. This is a part of the research work at RCSLab, University of Waterloo. Reviewed by: imp, emaste Differential Revision: https://reviews.freebsd.org/D41969
|
#
af93fea7 |
|
23-Aug-2023 |
Jake Freeland <jfree@freebsd.org> |
timerfd: Move implementation from linux compat to sys/kern Move the timerfd implementation from linux compat code to sys/kern. Use it to implement the new system calls for timerfd. Add a hook to kern_tc to allow timerfd to know when the system time has stepped. Add kqueue support to timerfd. Adjust a few names to be less Linux centric. RelNotes: YES Reviewed by: markj (on irc), imp, kib (with reservations), jhb (slack) Differential Revision: https://reviews.freebsd.org/D38459
|
#
4a69fc16 |
|
07-Oct-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add membarrier(2) This is an attempt at clean-room implementation of the Linux' membarrier(2) syscall. For documentation, you would need to read both membarrier(2) Linux man page, the comments in Linux kernel/sched/membarrier.c implementation and possibly look at actual uses. Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32360
|
#
78d14616 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line bare tag Remove /^\s*\$FreeBSD\$$\n/
|
#
602b575a |
|
20-Apr-2023 |
Warner Losh <imp@FreeBSD.org> |
syscall.master: Remove stray 4.2 Back in 4.3BSD, the system call table wasn't generated, and there was an entry: "4.2 sigreturn", /* 139 = old 4.2 sigreturn */ This got converted to 139 OBSOL 0 4.2 sigreturn in 4.3 RENO. Since it was obsolete, nothing bad happened. In fact, there was code in makesyscalls.sh to cope: { comment = $4 for (i = 5; i <= NF; i++) comment = comment " " $i if (NF < 5) $5 = $4 } so the generated comment in syscalls.c was almost correct: "obs_4.2", /* 139 = obsolete 4.2 sigreturn */ a bug that we have to this very day, despite makesyscalls.sh being rewritten in lua. However, this historical wart is the only place in our current syscalls.master file where we have an extra field for the 'not generated' class of system calls. Remove the historical wart so that the re-write of makesyscalls.lua can be simpler (so, I hope, qemu's bsd-user can have large swathes of code automatically generated too). This should help make things more understandable (changes to simplify makesyscalls.lua aren't quite debugged, so have to wait for another day). There are 3 different obsolete sigreturns (but only 1 that was ever in FreeBSD 2.x and newer). Sponsored by: Netflix
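The field handling quoted in the message above can be mimicked with a short Python sketch. It is not the original awk or Lua generator: the function name `obsolete_entry` is hypothetical, and taking the generated name from the fourth field is an assumption inferred from the quoted output.

```python
def obsolete_entry(line):
    """Mimic the quoted awk fragment for one OBSOL line of the old table format."""
    fields = line.split()            # awk's $1..$NF are fields[0..NF-1]
    number = fields[0]
    # comment = $4, then append $5..$NF separated by single spaces
    comment = " ".join(fields[3:])
    # if (NF < 5) $5 = $4: make sure a fifth field exists for later use
    if len(fields) < 5:
        fields.append(fields[3])
    name = fields[3]                 # assumption: the name is taken from $4
    return f'"obs_{name}", /* {number} = obsolete {comment} */'
```

Feeding it the line `139 OBSOL 0 4.2 sigreturn` yields the name `obs_4.2`, because the stray `4.2` sits where the syscall name is expected while the real name ends up only in the comment.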
|
#
559b94a1 |
|
20-Apr-2023 |
Warner Losh <imp@FreeBSD.org> |
syscall.master: Fix comments Have more accurate comments. While #if, #else, etc are copied to the header files, lines that don't start with # are not. And #include files are only output to sysinc (which winds up at the front of init_sysent.c which seems a bit odd). This is all radically undocumented, and likely has drifted somewhat from 4.4BSD and what other systems do (they've drifted too, fwiw). Sponsored by: Netflix
|
#
dac31024 |
|
31-Mar-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
Rename kqueue1(2) to kqueuex(2) to avoid compat issues with NetBSD Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39377
|
#
61194e98 |
|
25-Mar-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
Add kqueue1() syscall It takes the flags argument. Immediate use is to provide the KQUEUE_CLOEXEC flag for kqueue(2). Reviewed by: emaste, jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39271
|
#
52a1d90c |
|
13-Apr-2022 |
Ed Maste <emaste@FreeBSD.org> |
Allow posix_fadvise in capability mode posix_fadvise operates only on a provided fd. Noted by Mathieu <sigsys@gmail.com> in review D34761. No new CAP_ rights are added for posix_fadvise(), as 'advice' in general only influences when I/O happens; the fd must have existing CAP_ rights for actual data access. Reviewed by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34903
|
#
e5821a21 |
|
30-Mar-2022 |
Ed Maste <emaste@FreeBSD.org> |
syscalls.master: remove obsolete comment about compatibility tables Compatibility ABIs no longer use a separate syscalls.master. Fixes: be67ea40c5a0 ("freebsd32: generate from ...") Sponsored by: The FreeBSD Foundation
|
#
53465702 |
|
08-Dec-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
swapoff: add one more variant of the syscall Requested and reviewed by: brooks Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33343
|
#
c1a84727 |
|
08-Dec-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
syscalls: add COMPAT13 Reviewed by: brooks Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33343
|
#
6d37a167 |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: mprotect does not take a const The mprotect syscall declaration is not const. I added this one incorrectly in a944d28d0edf7ceb1bef4d789dfa4e8e18331658. Reported by: kib Reviewed by: kib, imp
|
#
a8efd4d1 |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: make syscall and __syscall SYSMUX Rather than combining the declaration of nosys with the registration of SYS_syscall, declare syscall(2) and __syscall(2) with the new SYSMUX type in syscalls.master and declare nosys directly. This eliminates the last use of syscall aliases in the tree. Reviewed by: kib, imp
|
#
d7f306c5 |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
makesyscalls: add a new SYSMUX type This type is for system call multiplexers (syscall(2), __syscall(2)) that don't have a normal handler and instead are handled in the machine-dependent syscall code. Reviewed by: kib, imp
|
#
cffb55f0 |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: normalize exit Declare the exit system call normally. This results in the implementation being named sys_exit rather than sys_sys_exit and being declared as returning an int. In fact it does not return at all because exit1 does not, so add an __unreachable() to let the compiler know that. Reviewed by: kib, imp
|
#
638c5fa8 |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: normalize (get|set)rlimit Declare normal <foo>_args structs rather than going out of the way to declare __<foo>_args. Reviewed by: kib, imp
|
#
ba4e5253 |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: normalize orecvfrom and ogetsockname Declare o<foo>_args rather than reusing the equivalent <foo>_args structs. Avoiding the addition of a new type isn't worth the gratuitous differences. Reviewed by: kib, imp
|
#
3660e76a |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: correct a couple style issues Reviewed by: kib, imp
|
#
33f9ea20 |
|
29-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: add missing SAL annotations freebsd7_shmctl was missing an annotation Reviewed by: kib, imp
|
#
be67ea40 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
freebsd32: generate from sys/kern/syscalls.master This avoids the need to keep a freebsd32-specific syscalls.master in sync with the default ABI. As evidenced by the number of commits required to sync the two, it is extremely easy for them to get out of sync due to misunderstandings and user errors. Reviewed by: kevans, kib
|
#
799ce8b8 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: annotate args pointing to long, pointer, or time_t Add _Contains_ annotations indicating that the data pointed to by a pointer argument contains types that vary between FreeBSD ABIs. The supported set is long (including size_t), pointer (including intptr_t), and time_t. The first two vary between 32- and 64-bit ABIs. The last varies between i386 and everything else. These will be used to detect which syscalls require handling on particular ABIs. Reviewed by: kevans, kib
|
#
00e0a4c0 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: abort2 doesn't return so declare as void Reviewed by: kib
|
#
4b2e1f14 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: umask returns a mode_t Reviewed by: kib
|
#
27f5b514 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: update a few return types to ssize_t Reviewed by: kib
|
#
717e7fb2 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: struct ucontext4 -> struct freebsd4_ucontext This aligns with struct freebsd4_ucontext32 in freebsd32. Reviewed by: kib
|
#
d8bd949b |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
sys___sysctl: regularize argument struct Let makesyscalls generate the normal struct __sysctl_args structure. It works fine. Reviewed by: kib
|
#
88dfcfa2 |
|
22-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
sys_sigaltstack: use struct sigaltstack arg This is identical to stack_t and more amenable to appending "32" for freebsd32. Reviewed by: kib
|
#
85d1d2a6 |
|
17-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: use struct siginfo rather than siginfo_t This allows freebsd32 to use struct siginfo32 with an automatable conversion. Reviewed by: kevans
|
#
f5032882 |
|
17-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: fix type of osendmsg osendmsg takes a struct omsghdr * not a void *. Reviewed by: kevans
|
#
2385f4d1 |
|
17-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: use __socklen_t as appropriate No functional change as __socklen_t is an int. Obtained from: CheriBSD Reviewed by: kevans
|
#
b64f3dc2 |
|
17-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: [gs]etitimer takes an int which Match the function declaration which takes an int not a u_int. No functional change as the range of valid values is 0-2. Obtained from: CheriBSD Reviewed by: kevans
|
#
b7fd8611 |
|
17-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: sprinkle in const values Add missing const qualifiers to a number of syscall arguments. Obtained from: CheriBSD Reviewed by: kevans
|
#
8e4a3add |
|
15-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
struct kevent_freebsd11 -> struct freebsd11_kevent Rename to match the naming of syscalls and allow 32 to be appended without making an ugly name like kevent_freebsd1132. While here, make the kevent changelist argument const. Reviewed by: kib
|
#
f0da2a14 |
|
15-Nov-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls: unwrap a long line Style dictates that each variable is on a single line Reviewed by: kib
|
#
77b2c2f8 |
|
22-Oct-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add sched_getcpu() for compatibility with Linux. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32901
|
#
6bc90e8a |
|
01-Sep-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: correct formatting issues Reviewed by: kevans, emaste MFC after: 1 week Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D31351
|
#
df501bac |
|
01-Sep-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: switch to CAPENABLED flags Switch the main syscall table to use CAPENABLED flags rather than capabilities.conf. This avoids synchronization issues between syscalls.master and capabilities.conf (e.g. when renaming a syscall during development). For now, move capabilities.conf to sys/compat/freebsd32 and use it there. Use of sys/compat/freebsd32/syscalls.master should be replaced by makesyscalls.lua enhancements to allow the main one to be used. This change results in no changes to generated files after running `make sysent`. Reviewed by: kevans, emaste MFC after: 1 week Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D31350
|
#
6945df3f |
|
01-Sep-2021 |
Brooks Davis <brooks@FreeBSD.org> |
makesyscalls.lua: add a CAPENABLED flag The CAPENABLED flag indicates that the syscall can be used in capsicum capability mode. It is intended to replace capabilities.conf. Reviewed by: kevans, emaste MFC after: 1 week Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D31349
|
#
0dc332bf |
|
05-Aug-2021 |
Ka Ho Ng <khng@FreeBSD.org> |
Add fspacectl(2), vn_deallocate(9) and VOP_DEALLOCATE(9). fspacectl(2) is a system call to provide space management support to userspace applications. VOP_DEALLOCATE(9) is a VOP call to perform the deallocation. vn_deallocate(9) is a public KPI for kmods' use. The purpose of proposing a new system call, a KPI and a VOP call is to allow bhyve or other hypervisor monitors to emulate the behavior of SCSI UNMAP/NVMe DEALLOCATE on a plain file. fspacectl(2) takes cmd and flags parameters to specify the space management operation to be performed. Currently cmd has to be SPACECTL_DEALLOC, and flags has to be 0. fo_fspacectl is added to fileops. VOP_DEALLOCATE(9) is added as a new VOP call. A trivial implementation of VOP_DEALLOCATE(9) is provided. Sponsored by: The FreeBSD Foundation Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D28347
|
#
9b6b793b |
|
19-Jul-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert most of ce42e793100b460f597e4c85ec0da12e274f9394 to restore ABI compatibility for pre-10.x binaries. It restores _umtx_lock() and _umtx_unlock() syscalls, and UMTX_OP_LOCK/ UMTX_OP_UNLOCK umtx_op(2) operations. UMUTEX_ERROR_CHECK flag is left out for now, I do not think it makes a difference. PR: 218571 Reviewed by: brooks (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31220
|
#
4bc2174a |
|
09-Jul-2019 |
Moritz Buhl <gh@moritzbuhl.de> |
kern: fail getgroup and setgroup with negative int Found using https://github.com/NetBSD/src/blob/trunk/tests/lib/libc/sys/t_getgroups.c getgroups/setgroups want an int and therefore casting it to u_int resulted in `getgroups(-1, ...)` not returning -1 / errno = EINVAL. imp@ updated syscall.master and made changes markj@ suggested PR: 189941 Tested by: imp@ Reviewed by: markj@ Pull Request: https://github.com/freebsd/freebsd-src/pull/407 Differential Revision: https://reviews.freebsd.org/D30617
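To make the signedness problem above concrete, the following minimal sketch uses `ctypes` to show what happens to -1 when it is viewed as an unsigned int. The two validation helpers are hypothetical models of a "reject negative counts" check, not the kernel's actual code.

```python
import ctypes
import errno


def check_count_as_uint(n):
    """Model of the old behaviour: the count is converted to u_int first."""
    as_uint = ctypes.c_uint(n).value      # -1 wraps to a large positive value
    # An unsigned value is never negative, so this check can never fire.
    return errno.EINVAL if as_uint < 0 else 0


def check_count_as_int(n):
    """Model of the fixed behaviour: the count stays a signed int."""
    as_int = ctypes.c_int(n).value
    return errno.EINVAL if as_int < 0 else 0
```

With a 32-bit `unsigned int`, -1 is seen as 4294967295, so only the signed variant can reject it with EINVAL.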
|
#
d89c1c46 |
|
26-Jan-2021 |
Brooks Davis <brooks@FreeBSD.org> |
Reserve gaps in syscall numbers for local use It is best for auditing of syscalls.master if we only append to the file. Reserving unimplemented system call numbers for local use makes this policy explicit and provides a large set of syscall numbers FreeBSD derivatives can use without risk of conflict. Reviewed by: jhb, kevans, kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D27988
|
#
119fa6ee |
|
26-Jan-2021 |
Brooks Davis <brooks@FreeBSD.org> |
syscalls.master: Add a new syscall type: RESERVED RESERVED syscall numbers are reserved for local/vendor use. RESERVED is identical to UNIMPL except that comments are ignored. Reviewed by: kevans Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D27988
|
#
65a524b4 |
|
26-Jan-2021 |
Brooks Davis <brooks@FreeBSD.org> |
Remove documentation of unimplemented syscalls We have not been able to run binaries from other BSDs for well over a decade. There is no need to document their allocation decisions here. We also don't need to reserve syscall numbers of never-implemented syscalls. Reviewed by: jhb, kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D27988
|
#
b3286afa |
|
05-Jan-2021 |
Alan Somers <asomers@FreeBSD.org> |
Reallocate syscall numbers for aio_writev and aio_readv The originally chosen numbers interfere with downstream projects' syscalls. Move them to the end of the syscall table instead. Reported by: jrtc27 Reviewed by: brooks MFC-With: 022ca2fc7fe08d51f33a1d23a9be49e6d132914e Differential Revision: 022ca2fc7fe08d51f33a1d23a9be49e6d132914e
|
#
022ca2fc |
|
02-Jan-2021 |
Alan Somers <asomers@FreeBSD.org> |
Add aio_writev and aio_readv POSIX AIO is great, but it lacks vectored I/O functions. This commit fixes that shortcoming by adding aio_writev and aio_readv. They aren't part of the standard, but they're an obvious extension. They work just like their synchronous equivalents pwritev and preadv. It isn't yet possible to use vectored aiocbs with lio_listio, but that could be added in the future. Reviewed by: jhb, kib, bcr Relnotes: yes Differential Revision: https://reviews.freebsd.org/D27743
|
#
7a202823 |
|
23-Dec-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Expose eventfd in the native API/ABI using a new __specialfd syscall eventfd is a Linux system call that produces special file descriptors for event notification. When porting Linux software, it is currently usually emulated by epoll-shim on top of kqueues. Unfortunately, kqueues are not passable between processes. And, as noted by the author of epoll-shim, even if they were, the library state would also have to be passed somehow. This came up when debugging strange HW video decode failures in Firefox. A native implementation would avoid these problems and help with porting Linux software. Since we now already have an eventfd implementation in the kernel (for the Linuxulator), it's pretty easy to expose it natively, which is what this patch does. Submitted by: greg@unrelenting.technology Reviewed by: markj (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26668
|
#
d9021e38 |
|
28-May-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a syscall for the nfs-over-tls daemons to use. The nfs-over-tls daemons need a system call to perform operations such as associating a file descriptor with a krpc socket. The daemons will not be in head for some time, but it will make it easier for testers of nfs-over-tls to do testing if the system call is in head (basically the stub for libc which will be committed soon). Reviewed by: brooks Differential Revision: https://reviews.freebsd.org/D24949
|
#
3e6b8291 |
|
23-Apr-2020 |
Kyle Evans <kevans@FreeBSD.org> |
close_range(2): use newly assigned AUE_CLOSERANGE
|
#
7d03e081 |
|
14-Apr-2020 |
Kyle Evans <kevans@FreeBSD.org> |
Mark closefrom(2) COMPAT12, reimplement in libc to wrap close_range Include a temporary compatibility shim as well for kernels predating close_range, since closefrom is used in some critical areas. Reviewed by: markj (previous version), kib Differential Revision: https://reviews.freebsd.org/D24399
|
#
472ced39 |
|
12-Apr-2020 |
Kyle Evans <kevans@FreeBSD.org> |
Implement a close_range(2) syscall close_range(min, max, flags) allows for a range of descriptors to be closed. The Python folk have indicated that they would much prefer this interface to closefrom(2), as the case may be that they/someone have special fds dup'd to higher in the range and they can't necessarily closefrom(min) because they don't want to hit the upper range, but relocating them to lower isn't necessarily feasible. sys_closefrom has been rewritten to use kern_close_range() using ~0U to indicate closing to the end of the range. This was chosen rather than requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the call to kern_close_range for simplicity. The flags argument of close_range(2) is currently unused, so any flags set is currently EINVAL. It was added to the interface in Linux so that future flags could be added for, e.g., "halt on first error" and things of this nature. This patch is based on a syscall of the same design that is expected to be merged into Linux. Reviewed by: kib, markj, vangyzen (all slightly earlier revisions) Differential Revision: https://reviews.freebsd.org/D21627
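The relationship between the two calls described above can be modelled in a few lines of Python. This is a toy model over a set of descriptor numbers, not a binding to the real syscall. The function names mirror the syscalls for readability, the inclusive upper bound follows the documented close_range(2) behaviour, and the return shape `(error, remaining)` is an assumption made for the sketch.

```python
import errno

UINT_MAX = 0xFFFFFFFF  # ~0U for a 32-bit unsigned int


def close_range(open_fds, lowfd, highfd, flags=0):
    """Model: close every descriptor in [lowfd, highfd]; return (error, remaining)."""
    if flags != 0:
        # No flags are defined yet, so any bit set is rejected.
        return errno.EINVAL, set(open_fds)
    remaining = {fd for fd in open_fds if not (lowfd <= fd <= highfd)}
    return 0, remaining


def closefrom(open_fds, lowfd):
    """Model of closefrom expressed as a close_range up to the end of the range."""
    return close_range(open_fds, lowfd, UINT_MAX, 0)
```

For example, with descriptors {0, 1, 2, 5, 7, 100} open, the modelled `close_range(fds, 5, 7)` leaves 100 untouched, which is the case the Python developers cared about, while `closefrom(fds, 3)` closes everything above stderr.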
|
#
0573d0a9 |
|
20-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add realpathat syscall realpath(3) is used a lot e.g., by clang and is a major source of getcwd and fstatat calls. This can be done more efficiently in the kernel. This works by performing a regular lookup while saving the name and found parent directory. If the terminal vnode is a directory we can resolve it using usual means. Otherwise we can use the name saved by lookup and resolve the parent. See the review for sample syscall counts. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23574
|
#
146fc63f |
|
09-Feb-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Add a way to manage thread signal mask using shared word, instead of syscall. A new syscall sigfastblock(2) is added which registers a uint32_t variable as containing the count of blocks for signal delivery. Its content is read by kernel on each syscall entry and on AST processing, non-zero count of blocks is interpreted same as the signal mask blocking all signals. The biggest downside of the feature that I see is that memory corruption that affects the registered fast sigblock location, would cause quite strange application misbehavior. For instance, the process would be immune to ^C (but killable by SIGKILL). With consumers (rtld and libthr added), benchmarks do not show a slow-down of the syscalls in micro-measurements, and macro benchmarks like buildworld do not demonstrate a difference. Part of the reason is that buildworld time is dominated by compiler, and clang already links to libthr. On the other hand, small utilities typically used by shell scripts have the total number of syscalls cut by half. The syscall is not exported from the stable libc version namespace on purpose. It is intended to be used only by our C runtime implementation internals. Tested by: pho Discussed with: cem, emaste, jilles Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D12773
|
#
2d5603fe |
|
18-Nov-2019 |
David Bright <dab@FreeBSD.org> |
Jail and capability mode for shm_rename; add audit support for shm_rename Co-mingling two things here: * Addressing some feedback from Konstantin and Kyle re: jail, capability mode, and a few other things * Adding audit support as promised. The audit support change includes a partial refresh of OpenBSM from upstream, where the change to add shm_rename has already been accepted. Matthew doesn't plan to work on refreshing anything else to support audit for those new event types. Submitted by: Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: kib Relnotes: Yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22083
|
#
96c914ee |
|
14-Nov-2019 |
Brooks Davis <brooks@FreeBSD.org> |
Tidy syscall declarations. Pointer arguments should be of the form "<type> *..." and not "<type>* ...". No functional change. Reviewed by: kevans Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22373
|
#
f403831e |
|
01-Oct-2019 |
Ed Maste <emaste@FreeBSD.org> |
syscalls.master: remove superfluous ellipsis in comment A single period is sufficient in this comment, and making this change lets us find references to varargs syscalls by searching for ...
|
#
11fd6a60 |
|
30-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
syscalls.master: consistency, move ); to newline (no functional change)
|
#
9afb12ba |
|
26-Sep-2019 |
David Bright <dab@FreeBSD.org> |
Add an shm_rename syscall Add an atomic shm rename operation, similar in spirit to a file rename. Atomically unlink an shm from a source path and link it to a destination path. If an existing shm is linked at the destination path, unlink it as part of the same atomic operation. The caller needs the same permissions as shm_unlink to the shm being renamed, and the same permissions for the shm at the destination which is being unlinked, if it exists. If those fail, EACCES is returned, as with the other shm_* syscalls. truss support is included; audit support will come later. This commit includes only the implementation; the sysent-generated bits will come in a follow-on commit. Submitted by: Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: jilles (earlier revision) Reviewed by: brueffer (manpages, earlier revision) Relnotes: yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D21423
|
#
234879a7 |
|
25-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
Mark shm_open(2) as COMPAT12, succeeded by shm_open2 Implementation and regenerated files will follow.
|
#
20f70576 |
|
25-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
Add a shm_open2 syscall to support upcoming memfd_create shm_open2 allows a little more flexibility than the original shm_open. shm_open2 doesn't enforce CLOEXEC on its callers, and it has a separate shmflag argument that can be expanded later. Currently the only shmflag is to allow file sealing on the returned fd. shm_open and memfd_create will both be implemented in libc to use this new syscall. __FreeBSD_version is bumped to indicate the presence. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21393
|
#
85c5f3cb |
|
25-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
Add COMPAT12 support to makesyscalls.sh Reviewed by: kib, imp, brooks (all without syscalls.master edits) Differential Revision: https://reviews.freebsd.org/D21366
|
#
d05b53e0 |
|
02-Sep-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
Add sysctlbyname system call Previously userspace would issue one syscall to resolve the sysctl and then another one to actually use it. Do it all in one trip. Fallback is provided in case newer libc happens to be running on an older kernel. Submitted by: Pawel Biernacki Reported by: kib, brooks Differential Revision: https://reviews.freebsd.org/D17282
|
#
bbbbeca3 |
|
24-Jul-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add kernel support for a Linux compatible copy_file_range(2) syscall. This patch adds support to the kernel for a Linux compatible copy_file_range(2) syscall and the related VOP_COPY_FILE_RANGE(9). This syscall/VOP can be used by the NFSv4.2 client to implement the Copy operation against an NFSv4.2 server to do file copies locally on the server. The vn_generic_copy_file_range() function in this patch can be used by the NFSv4.2 server to implement the Copy operation. Fuse may also be able to use the VOP_COPY_FILE_RANGE() method. vn_generic_copy_file_range() attempts to maintain holes in the output file in the range to be copied, but may fail to do so if the input and output files are on different file systems with different _PC_MIN_HOLE_SIZE values. Separate commits will be done for the generated syscall files and userland changes. A commit for a compat32 syscall will be done later. Reviewed by: kib, asomers (plus comments by brooks, jilles) Relnotes: yes Differential Revision: https://reviews.freebsd.org/D20584
|
#
e8ee7d90 |
|
16-Apr-2019 |
Ed Maste <emaste@FreeBSD.org> |
correct readlinkat(2) return type r176215 corrected readlink(2)'s return type and the type of the last argument. readlinkat(2) was introduced in r177788 after being developed as part of Google Summer of Code 2007; it appears to have inherited the wrong return type. Man pages and header files were already ssize_t; update syscalls.master to match. PR: 197915 Submitted by: Henning Petersen <henning.petersen@t-online.de> MFC after: 2 weeks
|
#
a1304030 |
|
06-Apr-2019 |
Mariusz Zaborski <oshogbo@FreeBSD.org> |
Introduce funlinkat syscall that allows us to check if we are removing the file associated with the given file descriptor. Reviewed by: kib, asomers Reviewed by: cem, jilles, brooks (they reviewed previous version) Discussed with: pjd, and many others Differential Revision: https://reviews.freebsd.org/D14567
|
#
10f7b12c |
|
17-Dec-2018 |
Brooks Davis <brooks@FreeBSD.org> |
const poison the `new` pointer of __sysctl. Reviewed by: kib Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D18444
|
#
d1fd400a |
|
07-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Add new file handle system calls. Namely, getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2). The syscalls are provided for a NFS userspace server (nfs-ganesha). Submitted by: Jack Halford <jack@gandi.net> Sponsored by: Gandi.net Tested by: pho Feedback from: brooks, markj MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18359
|
#
41f7b253 |
|
04-Dec-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove NOARGS from oaccept. This was in the original patch, but lost in a rebase. Reported by: andrew Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15816
|
#
d48719bd |
|
04-Dec-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Normalize COMPAT_43 syscall declarations. Have ogetkerninfo, ogetpagesize, ogethostname, osethostname, and oaccept declare o<foo>_args structs rather than non-compat ones. Due to a failure to use NOARGS in most cases this adds only one new declaration. No changes required in freebsd32 as only ogetpagesize() is implemented and it has a 32-bit specific implementation. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15816
|
#
9a38df59 |
|
09-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Fix freebsd32 mknod(at). As dev_t is now a 64-bit integer, it requires special handling as a system call argument. 64-bit arguments are split between two 64-bit integers due to the way arguments are promoted to allow reuse of most system call implementations. They must be reassembled before use. Further, 64-bit arguments at an odd offset (counting from zero) are padded and slid to the next slot on powerpc and mips. Fix the non-COMPAT11 system call by adding a freebsd32_mknodat() and appropriately padded declarations. The COMPAT11 system calls are fully compatible with the 64-bit implementations so remove the freebsd32_ versions. Use uint32_t consistently as the type of the old dev_t. This matches the old definition. Reviewed by: kib MFC after: 3 days Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17928
|
#
e56ec0e5 |
|
07-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
makesyscalls.sh: allow pointer return types. The previous code required that the return type be a single word. This allows it to be a pointer without using a typedef. Update the return types of break, mmap, and shmat to be void * as declared. This only affects systrace output in-tree, but can aid in generating system call wrappers from syscalls.master. Reviewed by: kib Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17873
|
#
dd4d2f21 |
|
06-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Update some comments made obsolete by recent commits.
|
#
318f0d77 |
|
06-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Use declared types for caddr_t arguments. Leave ptrace(2) alone for the moment as it's defined to take a caddr_t. Reviewed by: kib Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17852
|
#
44cbc1c2 |
|
05-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Fix a couple indentation errors in r339958.
|
#
12e69f96 |
|
02-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Add const to input-only char * arguments. These arguments are mostly paths handled by NAMEI*() macros which already take const char * arguments. This change improves the match between syscalls.master and the public declarations of system calls. Reviewed by: kib (prior version) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17812
|
#
2105ac07 |
|
01-Nov-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Use mode_t when the documented signature does. This is more clear and produces better results when generating function stubs from syscalls.master. Reviewed by: kib, emaste Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17784
|
#
e3e54813 |
|
31-Oct-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Reformat syscalls.master for better readability. This takes advantage of two recent changes to makesyscalls.sh: r328598: Permit a range of syscall numbers for UNIMPL r339624: Remove the need for backslashes in syscalls.master Syscall declarations are now split across multiple lines with the syscall name and variables each on separate lines (with an exception for syscalls taking no arguments.) Reviewed by: imp Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17706
|
#
22c0c9a4 |
|
22-Oct-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove __restrict qualifiers from syscalls.master. The restrict qualifier is intended to aid code generation in the compiler, but the only access to storage through these pointers is via structs using copyin/copyout and the like which can not be written in C or C++ and thus the compiler gains nothing from the qualifiers. As such, the qualifiers add no value in current usage. Reviewed by: kib Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17574
|
#
29bf3a7b |
|
15-Oct-2018 |
Kyle Evans <kevans@FreeBSD.org> |
Correct COMPAT* macro names in syscalls.master Both ^/sys/compat/freebsd32/syscalls.master and ^/sys/kern/syscalls.master cited "COMPAT[n] #ifdef" instead of "COMPAT_FREEBSD[n] #ifdef" in places. Approved by: re (glebius)
|
#
46e20549 |
|
28-Sep-2018 |
John Baldwin <jhb@FreeBSD.org> |
Mark various removed system calls as OBSOL instead of UNIMPL. This is mostly a cosmetic change except that obsolete system calls are assigned meaningful names in the names arrays which means that using tools like kdump or truss against binaries invoking these system calls will print out the name instead of the number. The script I use to generate the XML list of syscalls for GDB also ignores UNIMPL but not OBSOL entries. In general UNIMPL should only be used to reserve placeholders for system calls that have never been implemented while system calls that existed at one time in FreeBSD but were removed should be marked OBSOL instead. Reviewed by: brooks, kib, imp Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D17344
|
#
c542c43e |
|
16-Aug-2018 |
Jamie Gritton <jamie@FreeBSD.org> |
Revert r337922, except for some documentation-only bits. This needs to wait until user is changed to stop using jail(2). Differential Revision: D14791
|
#
284001a2 |
|
16-Aug-2018 |
Jamie Gritton <jamie@FreeBSD.org> |
Put jail(2) under COMPAT_FREEBSD11. It has been the "old" way of creating jails since FreeBSD 7. Along with the system call, put the various security.jail.allow_foo and security.jail.foo_allowed sysctls partly under COMPAT_FREEBSD11 (or BURN_BRIDGES). These sysctls had two disparate uses: on the system side, they were global permissions for jails created via jail(2) which lacked fine-grained permission controls; inside a jail, they're read-only descriptions of what the current jail is allowed to do. The first use is obsolete along with jail(2), but keep them for the second-read-only use. Differential Revision: D14791
|
#
7cc923f8 |
|
10-Jul-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Get rid of netbsd_lchown and netbsd_msync syscall entries. No valid FreeBSD binary ever called them (they would call lchown and msync directly) and we haven't supported NetBSD binaries in ages. This is a respin of r335983 with a workaround for the ancient BFD linker in the libc stubs. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16193
|
#
714c03c8 |
|
05-Jul-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Revert r335983. The bfd linker in tree doesn't support multiple names for the same symbol (at least with current flags).
|
#
5b04a71d |
|
05-Jul-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Get rid of netbsd_lchown and netbsd_msync syscall entries. No valid FreeBSD binary ever called them (they would call lchown and msync directly) and we haven't supported NetBSD binaries in ages. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15814
|
#
9da5364e |
|
14-Jun-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Name the implementation of brk and sbrk sys_break(). The break() system call was renamed (several times) starting in v3 AT&T UNIX when C was invented and break was a language keyword. The last vestige of a need for it to be called something else (eg obreak) was removed in r225617 which consistently prefixed all syscall implementations. Reviewed by: emaste, kib (older version) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15638
|
#
9f9c9b22 |
|
04-Jun-2018 |
Mark Johnston <markj@FreeBSD.org> |
Reimplement brk() and sbrk() to avoid the use of _end. Previously, libc.so would initialize its notion of the break address using _end, a special symbol emitted by the static linker following the bss section. Compatibility issues between lld and ld.bfd could cause the wrong definition of _end (libc.so's definition rather than that of the executable) to be used, breaking the brk()/sbrk() interface. Avoid this problem and future interoperability issues by simply not relying on _end. Instead, modify the break() system call to return the kernel's view of the current break address, and have libc initialize its state using an extra syscall upon the first use of the interface. As a side effect, this appears to fix brk()/sbrk() usage in executables run with rtld direct exec, since the kernel and libc.so no longer maintain separate views of the process' break address. PR: 228574 Reviewed by: kib (previous version) MFC after: 2 months Differential Revision: https://reviews.freebsd.org/D15663
|
#
64b378f1 |
|
30-May-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove alternative names that are identical to the default. Verified by make sysent producing no changes.
|
#
7351a8bd |
|
25-May-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Make vadvise compat freebsd11. The vadvise syscall (aka ovadvise) is undocumented and has always been implemented as returning EINVAL. Put the syscall under COMPAT11 and provide a userspace implementation. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15557
|
#
89ea4a30 |
|
05-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Added SAL annotations to system calls. Modify makesyscalls.sh to strip out SAL annotations. No functional change. This is based on work I started in CheriBSD and use to validate fat pointers at the syscall boundary. Tal Garfinkel reviewed the changes, added annotations to COMPAT* syscalls and is using them in a record and playback framework. One can envision other uses such as a WITNESS-like validator for copyin/out as speculated on in the review. At this time we are only annotating sys/kern/syscalls.master as that is sufficient for userspace work. If kernel use cases materialize, we can annotate other syscalls.master as needed. Submitted by: Tal Garfinkel <talg@cs.stanford.edu> Sponsored by: DARPA, AFRL (in part) Differential Revision: https://reviews.freebsd.org/D14285
|
#
e9ac2743 |
|
20-Mar-2018 |
Conrad Meyer <cem@FreeBSD.org> |
Implement getrandom(2) and getentropy(3) The general idea here is to provide userspace programs with well-defined sources of entropy, in a fashion that doesn't require opening a new file descriptor (ulimits) or accessing paths (/dev/urandom may be restricted by chroot or capsicum). getrandom(2) is the more general API, and comes from the Linux world. Since our urandom and random devices are identical, the GRND_RANDOM flag is ignored. getentropy(3) is added as a compatibility shim for the OpenBSD API. truss(1) support is included. Tests for both system calls are provided. Coverage is believed to be at least as comprehensive as LTP getrandom(2) test coverage. Additionally, instructions for running the LTP tests directly against FreeBSD are provided in the "Test Plan" section of the Differential revision linked below. (They pass, of course.) PR: 194204 Reported by: David CARLIER <david.carlier AT hardenedbsd.org> Discussed with: cperciva, delphij, jhb, markj Relnotes: maybe Differential Revision: https://reviews.freebsd.org/D14500
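A minimal sketch of how a program might call getrandom(2) without opening a file descriptor, assuming a system that ships <sys/random.h> (FreeBSD 12 or later, or a recent glibc). The helper name fill_random is hypothetical and is not part of this commit.

```c
#include <sys/types.h>
#include <sys/random.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: fill buf with len random bytes.
 * Retries when interrupted by a signal and continues after a short read.
 * Returns 0 on success, -1 on any other error (errno is left set).
 */
int
fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len > 0) {
		/* flags == 0: block until the entropy source is seeded. */
		ssize_t n = getrandom(p, len, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}
	return (0);
}
```

A caller would typically pass a fixed-size key or nonce buffer, for example `unsigned char key[32]; fill_random(key, sizeof(key));`.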
|
#
1c1b4c66 |
|
05-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove remnants of 1990s efforts to let us run Net/OpenBSD binaries. No functional change (comments change in some generated files.) Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14571
|
#
315fbaec |
|
23-Feb-2018 |
Ed Maste <emaste@FreeBSD.org> |
Correct pseudo misspelling in sys/ comments contrib code and #define in intel_ata.h unchanged.
|
#
3f289c3f |
|
12-Jan-2018 |
Jeff Roberson <jeff@FreeBSD.org> |
Implement 'domainset', a cpuset based NUMA policy mechanism. This allows userspace to control NUMA policy administratively and programmatically. Implement domainset based iterators in the page layer. Remove the now legacy numa_* syscalls. Cleanup some header pollution created by having seq.h in proc.h. Reviewed by: markj, kib Discussed with: alc Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D13403
|
#
5cd667e6 |
|
28-Nov-2017 |
Brooks Davis <brooks@FreeBSD.org> |
Disable vim syntax highlighting. Vim's default pick doesn't understand that ';' is a comment character and the result looks horrible. Reviewed by: emaste
|
#
2b34e843 |
|
16-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Add abstime kqueue(2) timers and expand struct kevent members. This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which specifies that the data field contains absolute time to fire the event. To make this useful, data member of the struct kevent must be extended to 64bit. Using the opportunity, I also added ext members. This changes struct kevent almost to Apple struct kevent64, except I did not change the type of ident and udata, the latter would cause serious API incompatibilities. The type of ident was kept uintptr_t since EVFILT_AIO returns a pointer in this field, and e.g. CHERI is sensitive to the type (discussed with brooks, jhb). Unlike Apple kevent64, symbol versioning allows us to claim ABI compatibility and still name the new syscall kevent(2). Compat shims are provided for both host native and compat32. Requested by: bapt Reviewed by: bapt, brooks, ngie (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D11025
|
#
69921123 |
|
23-May-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Commit the 64-bit inode project. Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify struct dirent layout to add d_off, increase the size of d_fileno to 64-bits, increase the size of d_namlen to 16-bits, and change the required alignment. Increase struct statfs f_mntfromname[] and f_mntonname[] array length MNAMELEN to 1024. ABI breakage is mitigated by providing compatibility using versioned symbols, ingenious use of the existing padding in structures, and by employing other tricks. Unfortunately, not everything can be fixed, especially outside the base system. For instance, third-party APIs which pass struct stat around are broken in backward and forward incompatible ways. Kinfo sysctl MIBs ABI is changed in backward-compatible way, but there is no general mechanism to handle other sysctl MIBS which return structures where the layout has changed. It was considered that the breakage is either in the management interfaces, where we usually allow ABI slip, or is not important. Struct xvnode changed layout, no compat shims are provided. For struct xtty, dev_t tty device member was reduced to uint32_t. It was decided that keeping ABI compat in this case is more useful than reporting 64-bit dev_t, for the sake of pstat. Update note: strictly follow the instructions in UPDATING. Build and install the new kernel with COMPAT_FREEBSD11 option enabled, then reboot, and only then install new world. Credits: The 64-bit inode project, also known as ino64, started life many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick (mckusick) then picked up and updated the patch, and acted as a flag-waver. Feedback, suggestions, and discussions were carried by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles), and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial ports investigation followed by an exp-run by Antoine Brodin (antoine). Essential and all-embracing testing was done by Peter Holm (pho). 
The heavy lifting of coordinating all these efforts and bringing the project to completion was done by Konstantin Belousov (kib). Sponsored by: The FreeBSD Foundation (emaste, kib) Differential revision: https://reviews.freebsd.org/D10439
|
#
982519d1 |
|
06-Apr-2017 |
Brooks Davis <brooks@FreeBSD.org> |
Change the size argument of __getcwd() to size_t. This matches the getcwd() definition. This is technically an ABI change, but that would only affect 64-bit big-endian platforms that pass arguments on the stack. We have none of those. Reviewed by: jhb Obtained from: CheriABI Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9428
|
#
d8ca0a2b |
|
29-Mar-2017 |
Robert Watson <rwatson@FreeBSD.org> |
Hook up new audit event identifiers for various non-Orange Book/CAPP system calls supported by OpenBSM 1.2-alpha5. Obtained from: TrustedBSD Project MFC after: 3 weeks Sponsored by: DARPA, AFRL
|
#
3f8455b0 |
|
18-Mar-2017 |
Eric van Gyzen <vangyzen@FreeBSD.org> |
Add clock_nanosleep() Add a clock_nanosleep() syscall, as specified by POSIX. Make nanosleep() a wrapper around it. Attach the clock_nanosleep test from NetBSD. Adjust it for the FreeBSD behavior of updating rmtp only when interrupted by a signal. I believe this to be POSIX-compliant, since POSIX mentions the rmtp parameter only in the paragraph about EINTR. This is also what Linux does. (NetBSD updates rmtp unconditionally.) Copy the whole nanosleep.2 man page from NetBSD because it is complete and closely resembles the POSIX description. Edit, polish, and reword it a bit, being sure to keep any relevant text from the FreeBSD page. Reviewed by: kib, ngie, jilles MFC after: 3 weeks Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D10020
|
#
34ed0c63 |
|
27-Dec-2016 |
John Baldwin <jhb@FreeBSD.org> |
Rename the 'flags' argument to getfsstat() to 'mode' and validate it. This argument is not a bitmask of flags, but only accepts a single value. Fail with EINVAL if an invalid value is passed to 'flag'. Rename the 'flags' argument to getmntinfo(3) to 'mode' as well to match. This is a followup to r308088. Reviewed by: kib MFC after: 1 month
|
#
82d8d2b8 |
|
07-Dec-2016 |
Robert Watson <rwatson@FreeBSD.org> |
Replace spaces with tabs in definition of SCTP system calls, for consistency with the remainder of the syscalls.master file. This problem does not occur in the freebsd32 version of the same system calls.
|
#
5cba398b |
|
18-Aug-2016 |
George V. Neville-Neil <gnn@FreeBSD.org> |
Remove unused and obsolete openbsd_poll system call. (Phase 1) Reported by: brooks Reviewed by: brooks,jhb Differential Revision: https://reviews.freebsd.org/D7548
|
#
295af703 |
|
15-Aug-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Add an implementation of fdatasync(2). The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing code with fsync(2). For all filesystems, this commit provides the implementation which delegates the work of VOP_FDATASYNC() to VOP_FSYNC(). This is functionally correct but not efficient. This is not yet POSIX-compliant implementation, because it does not ensure that queued AIO requests are completed before returning. Reviewed by: mckusick Discussed with: avg (ZFS), jhb (AIO part) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D7471
|
#
78be18ae |
|
03-Aug-2016 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Correct some comments. Sponsored by: EMC / Isilon Storage Division MFC after: 3 days
|
#
1b42875d |
|
03-Aug-2016 |
Ed Schouten <ed@FreeBSD.org> |
Re-add trailing slash that was removed in r303699. I must have accidentally pressed some random key in vim.
|
#
a813fdc6 |
|
03-Aug-2016 |
Ed Schouten <ed@FreeBSD.org> |
mprotect(): Change prototype to comply to POSIX. Our mprotect() function seems to take a "const void *" address to the pages whose permissions need to be adjusted. POSIX uses "void *". Simply stick to the POSIX one to prevent us from writing unportable code. PR: 211423 (exp-run) Tested by: antoine@ (Thanks!)
|
#
d9c4cd2f |
|
27-Jul-2016 |
Ed Schouten <ed@FreeBSD.org> |
Change the return type of msgrcv() to ssize_t as required by POSIX. It looks like the msgrcv() system call is already written in such a way that the size is internally computed as a size_t and written into all of td_retval[0]. This means that it is effectively already returning ssize_t. It's just that the userspace prototype doesn't match up.
|
#
e5ec7339 |
|
10-Jul-2016 |
Robert Watson <rwatson@FreeBSD.org> |
Do allow auditing of read(2) and write(2) system calls, by assigning those system calls audit event identifiers AUE_READ and AUE_WRITE. While auditing file-descriptor I/O is not required by the Common Criteria, in practice this proves useful for both live and forensic analysis. NB: freebsd32 already assigns AUE_READ and AUE_WRITE to read(2) and write(2). MFC after: 3 days Sponsored by: DARPA, AFRL
|
#
e16e6409 |
|
22-Jun-2016 |
Brooks Davis <brooks@FreeBSD.org> |
Mark the pipe() system call as COMPAT10. As of r302092 libc uses pipe2() with a zero flags value instead of pipe(). Commit with regenerated files and implementation to follow. Approved by: re (gjb) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D6816
|
#
bb430bc7 |
|
21-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Fully handle size_t lengths in AIO requests. First, update the return types of aio_return() and aio_waitcomplete() to ssize_t. POSIX requires aio_return() to return a ssize_t so that it can represent all return values from read() and write(). aio_waitcomplete() should use ssize_t for the same reason. aio_return() has used ssize_t in <aio.h> since r31620 but the manpage and system call entry were not updated. aio_waitcomplete() has always returned int. Note that this does not require new system call stubs as this is effectively only an API change in how the compiler interprets the return value. Second, allow aio_nbytes values up to IOSIZE_MAX instead of just INT_MAX. aio_read/write should now honor the same length limits as normal read/write. Third, use longs instead of ints in the aio_return() and aio_waitcomplete() system call functions so that the 64-bit size_t in the in-kernel aiocb isn't truncated to 32-bits before being copied out to userland or being returned. Finally, a simple test has been added to verify the bounds checking on the maximum read size from a file.
|
#
399e8c17 |
|
09-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Simplify AIO initialization now that it is standard. - Mark AIO system calls as STD and remove the helpers to dynamically register them. - Use COMPAT6 for the old system calls with the older sigevent instead of an 'o' prefix. - Simplify the POSIX configuration to note that AIO is always available. - Handle AIO in the default VOP_PATHCONF instead of special casing it in the pathconf() system call. fpathconf() is still hackish. - Remove freebsd32_aio_cancel() as it just called the native one directly. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5589
|
#
871ef8b0 |
|
11-Jul-2015 |
Adrian Chadd <adrian@FreeBSD.org> |
Regenerate syscalls.
|
#
0538aafc |
|
18-Apr-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
The lseek(2), mmap(2), truncate(2), ftruncate(2), pread(2), and pwrite(2) syscalls are wrapped to provide compatibility with pre-7.x kernels which required padding before the off_t parameter. The fcntl(2) contains compatibility code to handle kernels before the struct flock was changed during the 8.x CURRENT development. The shims were reasonable to allow easier revert to the older kernel at that time. Now, two or three major releases later, shims do not serve any purpose. Such old kernels cannot handle current libc, so revert the compatibility code. Make padded syscalls support conditional under the COMPAT6 config option. For COMPAT32, the syscalls were under COMPAT6 already. Remove WITHOUT_SYSCALL_COMPAT build option, whose only purpose was to (partially) disable the removed shims. Reviewed by: jhb, imp (previous versions) Discussed with: peter Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
2205e0d1 |
|
23-Jan-2015 |
Jilles Tjoelker <jilles@FreeBSD.org> |
Add futimens and utimensat system calls. The core kernel part is patch file utimes.2008.4.diff from pluknet@FreeBSD.org. I updated the code for API changes, added the manual page and added compatibility code for old kernels. There is also audit and Capsicum support. A new UTIME_* constant might allow setting birthtimes in future. Differential Revision: https://reviews.freebsd.org/D1426 Submitted by: pluknet (partially) Reviewed by: delphij, pluknet, rwatson Relnotes: yes
|
#
9f7a06f2 |
|
04-Jan-2015 |
Dmitry Chagin <dchagin@FreeBSD.org> |
Indeed, instead of hiding the kern___getcwd() bug by bogus cast in r276564, change path type to char * (pathnames are always char *). And remove bogus casts of malloc(). kern___getcwd() internally doesn't actually use or support u_char * paths, except to copy them to a normal char * path. These changes are not visible to libc as libc/gen/getcwd.c misdeclares __getcwd() as taking a plain char * path. While here remove _SYS_SYSPROTO_H_ for __getcwd() syscall as we always have sysproto.h. Pointed out by: bde MFC after: 1 week
|
#
186d9c34 |
|
12-Nov-2014 |
Dmitry Chagin <dchagin@FreeBSD.org> |
Add the ppoll() system call. Export kern_poll() needed by an upcoming Linuxulator change. Differential Revision: https://reviews.freebsd.org/D1133 Reviewed by: kib, wblock MFC after: 1 month
|
#
80b47aef |
|
09-Oct-2014 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Move the SCTP syscalls to netinet with the rest of the SCTP code. The syscalls themselves are tightly coupled with the network stack and therefore should not be in the generic socket code. The following four syscalls have been marked as NOSTD so they can be dynamically registered in sctp_syscalls_init() function: sys_sctp_peeloff sys_sctp_generic_sendmsg sys_sctp_generic_sendmsg_iov sys_sctp_generic_recvmsg The syscalls are also set up to be dynamically registered when COMPAT32 option is configured. As a side effect of moving the SCTP syscalls, getsock_cap needs to be made available outside of the uipc_syscalls.c source file. A proper prototype has been added to the sys/socketvar.h header file. API tests from the SCTP reference implementation have been run to ensure compatibility. (http://code.google.com/p/sctp-refimpl/source/checkout) Submitted by: Steve Kiernan <stevek@juniper.net> Reviewed by: tuexen, rrs Obtained from: Juniper Networks, Inc.
|
#
ce42e793 |
|
18-Mar-2014 |
Attilio Rao <attilio@FreeBSD.org> |
Remove dead code from umtx support: - Retire long time unused (basically always unused) sys__umtx_lock() and sys__umtx_unlock() syscalls - struct umtx and their supporting definitions - UMUTEX_ERROR_CHECK flag - Retire UMTX_OP_LOCK/UMTX_OP_UNLOCK from _umtx_op() syscall __FreeBSD_version is not bumped yet because it is expected that further breakages to the umtx interface will follow up in the next days. However there will be a final bump when necessary. Sponsored by: EMC / Isilon storage division Reviewed by: jhb
|
#
55648840 |
|
19-Sep-2013 |
John Baldwin <jhb@FreeBSD.org> |
Extend the support for exempting processes from being killed when swap is exhausted. - Add a new protect(1) command that can be used to set or revoke protection from arbitrary processes. Similar to ktrace it can apply a change to all existing descendants of a process as well as future descendants. - Add a new procctl(2) system call that provides a generic interface for control operations on processes (as opposed to the debugger-specific operations provided by ptrace(2)). procctl(2) uses a combination of idtype_t and an id to identify the set of processes on which to operate similar to wait6(). - Add a PROC_SPROTECT control operation to manage the protection status of a set of processes. MADV_PROTECT still works for backwards compatibility. - Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc) the first bit of which is used to track if P_PROTECT should be inherited by new child processes. Reviewed by: kib, jilles (earlier version) Approved by: re (delphij) MFC after: 1 month
|
#
84c21af1 |
|
12-Sep-2013 |
John Baldwin <jhb@FreeBSD.org> |
Fix the type of the idtype argument to wait6() in syscalls.master. Approved by: re (kib) MFC after: 1 week
|
#
7008be5b |
|
04-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. 
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. 
This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation
|
#
6160e12c |
|
08-Jun-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Add new system call - aio_mlock(). The name speaks for itself. It allows the mlock(2) operation, which can consume a lot of time, to be performed under control of aio(4). Reviewed by: kib, jilles Sponsored by: Nginx, Inc.
|
#
48947ecc |
|
21-May-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix the wait6(2) on 32bit architectures and for the compat32, by using the right type for the argument in syscalls.master. Also fix the posix_fallocate(2) and posix_fadvise(2) compat32 syscalls on the architectures which require padding of the 64bit argument. Noted and reviewed by: jhb Pointy hat to: kib MFC after: 1 week
|
#
dc570d5e |
|
01-May-2013 |
Jilles Tjoelker <jilles@FreeBSD.org> |
Add pipe2() system call. The pipe2() function is similar to pipe() but allows setting FD_CLOEXEC and O_NONBLOCK (on both sides) as part of the function. If p points to two writable ints, pipe2(p, 0) is equivalent to pipe(p). If the pointer is not valid, behaviour differs: pipe2() writes into the array from the kernel like socketpair() does, while pipe() writes into the array from an architecture-specific assembler wrapper. Reviewed by: kan, kib
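As a minimal usage sketch of the call described above: the helper names are hypothetical, and the `_GNU_SOURCE` define is an assumption needed only on glibc-based systems to expose the prototype. The flags passed to pipe2() are `O_CLOEXEC` and `O_NONBLOCK`.

```c
#define _GNU_SOURCE   /* assumption: required on glibc to declare pipe2() */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a pipe whose two ends are both close-on-exec and non-blocking.
 * Returns 0 on success, -1 on error with errno set by pipe2(). */
static int make_cloexec_pipe(int fds[2])
{
    return pipe2(fds, O_CLOEXEC | O_NONBLOCK);
}

/* Report whether a descriptor has both FD_CLOEXEC and O_NONBLOCK set. */
static int has_cloexec_nonblock(int fd)
{
    int fdflags = fcntl(fd, F_GETFD);
    int flflags = fcntl(fd, F_GETFL);

    if (fdflags == -1 || flflags == -1)
        return 0;
    return (fdflags & FD_CLOEXEC) != 0 && (flflags & O_NONBLOCK) != 0;
}
```

Setting the flags at creation time avoids the window between pipe() and a later fcntl() in which another thread could fork and exec with the descriptors still inheritable.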
|
#
da7d2afb |
|
01-May-2013 |
Jilles Tjoelker <jilles@FreeBSD.org> |
Add accept4() system call. The accept4() function, compared to accept(), allows setting the new file descriptor atomically close-on-exec and explicitly controlling the non-blocking status on the new socket. (Note that the latter point means that accept() is not equivalent to any form of accept4().) The linuxulator's accept4 implementation leaves a race window where the new file descriptor is not close-on-exec because it calls sys_accept(). This implementation leaves no such race window (by using falloc() flags). The linuxulator could be fixed and simplified by using the new code. Like accept(), accept4() is async-signal-safe, a cancellation point and permitted in capability mode.
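A short sketch of how a caller might use accept4() as described above. The wrapper name is hypothetical, and `_GNU_SOURCE` is an assumption needed only on glibc-based systems.

```c
#define _GNU_SOURCE   /* assumption: required on glibc to declare accept4() */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Accept one connection. The new descriptor is close-on-exec from the
 * moment it exists; it is non-blocking only if the caller asks for it.
 * Returns the new descriptor, or -1 with errno set by accept4(). */
static int accept_cloexec(int listen_fd, int nonblocking)
{
    int flags = SOCK_CLOEXEC | (nonblocking ? SOCK_NONBLOCK : 0);

    return accept4(listen_fd, NULL, NULL, flags);
}
```

The peer address is not requested here (both address arguments are NULL); a caller that needs it would pass a sockaddr buffer and its length instead.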
|
#
e324bf91 |
|
01-Apr-2013 |
Matthew D Fleming <mdf@FreeBSD.org> |
Fix return type of extattr_set_* and fix rmextattr(8) utility. extattr_set_{fd,file,link} is logically a write(2)-like operation and should return ssize_t, just like extattr_get_*. Also, the user-space utility was using an int for the return value of extattr_get_* and extattr_list_*, both of which return an ssize_t. MFC after: 1 week
|
#
e948704e |
|
21-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Implement chflagsat(2) system call, similar to fchmodat(2), but operates on file flags. Reviewed by: kib, jilles Sponsored by: The FreeBSD Foundation
|
#
b4b2596b |
|
21-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type u_long. Before this change it was of type int for syscalls, but prototypes in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not for lchflags(2)) stated that it was u_long. Now some related functions use u_long type for flags (strtofflags(3), fflagstostr(3)). - Make path argument of type 'const char *' for consistency. Discussed on: arch Sponsored by: The FreeBSD Foundation
|
#
7493f24e |
|
02-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Implement two new system calls: int bindat(int fd, int s, const struct sockaddr *addr, socklen_t addrlen); int connectat(int fd, int s, const struct sockaddr *name, socklen_t namelen); which allow binding and connecting, respectively, to a UNIX domain socket with a path relative to the directory associated with the given file descriptor 'fd'. - Add manual pages for the new syscalls. - Make the new syscalls available for processes in capability mode sandbox. - Add capability rights CAP_BINDAT and CAP_CONNECTAT that have to be present on the directory descriptor for the syscalls to work. - Update audit(4) to support those two new syscalls and to handle path in sockaddr_un structure relative to the given directory descriptor. - Update procstat(1) to recognize the new capability rights. - Document the new capability rights in cap_rights_limit(2). Sponsored by: The FreeBSD Foundation Discussed with: rwatson, jilles, kib, des
|
#
2609222a |
|
01-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge Capsicum overhaul: - Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights. - The cap_new(2) system call is left, but it is no longer documented and should not be used in new code. - The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one. - The cap_getrights(2) syscall is renamed to cap_rights_get(2). - If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrieved with cap_ioctls_get(2) syscall. - If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrieve them with cap_fcntls_get(2). - To support ioctl and fcntl white-listing the filedesc structure was heavily modified. - The audit subsystem, kdump and procstat tools were updated to recognize new syscalls. - Capability rights were revised and even though I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below: CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT. Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2). Added CAP_SYMLINKAT: - Allow for symlinkat(2). Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2). Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory. Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall. Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. - Required if target of renameat(2) exists and will be removed by this call. 
Removed CAP_MAPEXEC. CAP_MMAP old behaviour: - Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE. CAP_MMAP new behaviour: - Allow for mmap(2)+PROT_NONE. Added CAP_MMAP_R: - Allow for mmap(PROT_READ). Added CAP_MMAP_W: - Allow for mmap(PROT_WRITE). Added CAP_MMAP_X: - Allow for mmap(PROT_EXEC). Added CAP_MMAP_RW: - Allow for mmap(PROT_READ | PROT_WRITE). Added CAP_MMAP_RX: - Allow for mmap(PROT_READ | PROT_EXEC). Added CAP_MMAP_WX: - Allow for mmap(PROT_WRITE | PROT_EXEC). Added CAP_MMAP_RWX: - Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC). Renamed CAP_MKDIR to CAP_MKDIRAT. Renamed CAP_MKFIFO to CAP_MKFIFOAT. Renamed CAP_MKNOD to CAP_MKNODAT. CAP_READ old behaviour: - Allow pread(2). - Disallow read(2), readv(2) (if there is no CAP_SEEK). CAP_READ new behaviour: - Allow read(2), readv(2). - Disallow pread(2) (CAP_SEEK was also required). CAP_WRITE old behaviour: - Allow pwrite(2). - Disallow write(2), writev(2) (if there is no CAP_SEEK). CAP_WRITE new behaviour: - Allow write(2), writev(2). - Disallow pwrite(2) (CAP_SEEK was also required). 
Added convenient defines: #define CAP_PREAD (CAP_SEEK | CAP_READ) #define CAP_PWRITE (CAP_SEEK | CAP_WRITE) #define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ) #define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE) #define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL) #define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W) #define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X) #define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X) #define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X) #define CAP_RECV CAP_READ #define CAP_SEND CAP_WRITE #define CAP_SOCK_CLIENT \ (CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \ CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN) #define CAP_SOCK_SERVER \ (CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \ CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \ CAP_SETSOCKOPT | CAP_SHUTDOWN) Added defines for backward API compatibility: #define CAP_MAPEXEC CAP_MMAP_X #define CAP_DELETE CAP_UNLINKAT #define CAP_MKDIR CAP_MKDIRAT #define CAP_RMDIR CAP_UNLINKAT #define CAP_MKFIFO CAP_MKFIFOAT #define CAP_MKNOD CAP_MKNODAT #define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER) Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib
|
#
f13b5a0f |
|
12-Nov-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Add the wait6(2) system call. It takes POSIX waitid()-like process designator to select a process which is waited for. The system call optionally returns siginfo_t which would be otherwise provided to SIGCHLD handler, as well as extended structure accounting for child and cumulative grandchild resource usage. Allow to get the current rusage information for non-exited processes as well, similar to Solaris. The explicit WEXITED flag is required to wait for exited processes, allowing for more fine-grained control of the events the waiter is interested in. Fix the handling of siginfo for WNOWAIT option for all wait*(2) family, by not removing the queued signal state. PR: standards/170346 Submitted by: "Jukka A. Ukkonen" <jau@iki.fi> MFC after: 1 month
|
#
d65f1abc |
|
16-Aug-2012 |
David Xu <davidxu@FreeBSD.org> |
Implement syscall clock_getcpuclockid2, so we can get a clock id for process, thread or others we want to support. Use the syscall to implement POSIX API clock_getcpuclockid and pthread_getcpuclockid. PR: 168417
|
#
520b6a84 |
|
25-May-2012 |
Ed Schouten <ed@FreeBSD.org> |
Remove use of non-ISO-C integer types from system call tables. These files already use ISO-C-style integer types, so make them less inconsistent by preferring the standard types.
|
#
cf13a585 |
|
20-Nov-2011 |
Lawrence Stewart <lstewart@FreeBSD.org> |
- Add the ffclock_getcounter(), ffclock_getestimate() and ffclock_setestimate() system calls to provide feed-forward clock management capabilities to userspace processes. ffclock_getcounter() returns the current value of the kernel's feed-forward clock counter. ffclock_getestimate() returns the current feed-forward clock parameter estimates and ffclock_setestimate() updates the feed-forward clock parameter estimates. - Document the syscalls in the ffclock.2 man page. - Regenerate the script-derived syscall related files. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Submitted by: Julien Ridoux (jridoux at unimelb edu au)
|
#
d3a993d4 |
|
18-Nov-2011 |
Ed Schouten <ed@FreeBSD.org> |
Improve *access*() parameter name consistency. The current code mixes the use of `flags' and `mode'. This is a bit confusing, since the faccessat() function has a `flag' parameter to store the AT_ flag. Make this less confusing by using the same name as used in the POSIX specification -- `amode'.
|
#
936c09ac |
|
03-Nov-2011 |
John Baldwin <jhb@FreeBSD.org> |
Add the posix_fadvise(2) system call. It is somewhat similar to madvise(2) except that it operates on a file descriptor instead of a memory region. It is currently only supported on regular files. Just as with madvise(2), the advice given to posix_fadvise(2) can be divided into two types. The first type provide hints about data access patterns and are used in the file read and write routines to modify the I/O flags passed down to VOP_READ() and VOP_WRITE(). These modes are thus filesystem independent. Note that to ease implementation (and since this API is only advisory anyway), only a single non-normal range is allowed per file descriptor. The second type of hints are used to hint to the OS that data will or will not be used. These hints are implemented via a new VOP_ADVISE(). A default implementation is provided which does nothing for the WILLNEED request and attempts to move any clean pages to the cache page queue for the DONTNEED request. This latter case required two other changes. First, a new V_CLEANONLY flag was added to vinvalbuf(). This requests vinvalbuf() to only flush clean buffers for the vnode from the buffer cache and to not remove any backing pages from the vnode. This is used to ensure clean pages are not wired into the buffer cache before attempting to move them to the cache page queue. The second change adds a new vm_object_page_cache() method. This method is somewhat similar to vm_object_page_remove() except that instead of freeing each page in the specified range, it attempts to move clean pages to the cache queue if possible. To preserve the ABI of struct file, the f_cdevpriv pointer is now reused in a union to point to the currently active advice region if one is present for regular files. Reviewed by: jilles, kib, arch@ Approved by: re (kib) MFC after: 1 month
|
#
cfb5f768 |
|
18-Aug-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Add experimental support for process descriptors A "process descriptor" file descriptor is used to manage processes without using the PID namespace. This is required for Capsicum's Capability Mode, where the PID namespace is unavailable. New system calls pdfork(2) and pdkill(2) offer the functional equivalents of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote process for debugging purposes. The currently-unimplemented pdwait(2) will, in the future, allow querying rusage/exit status. In the interim, poll(2) may be used to check (and wait for) process termination. When a process is referenced by a process descriptor, it does not issue SIGCHLD to the parent, making it suitable for use in libraries---a common scenario when using library compartmentalisation from within large applications (such as web browsers). Some observers may note a similarity to Mach task ports; process descriptors provide a subset of this behaviour, but in a UNIX style. This feature is enabled by "options PROCDESC", but as with several other Capsicum kernel features, is not enabled by default in GENERIC 9.0. Reviewed by: jhb, kib Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc
|
#
cfb9df55 |
|
15-Jul-2011 |
Jonathan Anderson <jonathan@FreeBSD.org> |
Add cap_new() and cap_getrights() system calls. Implement two previously-reserved Capsicum system calls: - cap_new() creates a capability to wrap an existing file descriptor - cap_getrights() queries the rights mask of a capability. Approved by: mentor (rwatson), re (Capsicum blanket) Sponsored by: Google Inc
|
#
d91f88f7 |
|
18-Apr-2011 |
Matthew D Fleming <mdf@FreeBSD.org> |
Add the posix_fallocate(2) syscall. The default implementation in vop_stdallocate() is filesystem agnostic and will run as slow as a read/write loop in userspace; however, it serves to correctly implement the functionality for filesystems that do not implement a VOP_ALLOCATE. Note that __FreeBSD_version was already bumped today to 900036 for any ports which would like to use this function. Also reserve space in the syscall table for posix_fadvise(2). Reviewed by: -arch (previous version)
|
#
ec125fbb |
|
30-Mar-2011 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add rctl. It's used by racct to take user-configurable actions based on the set of rules it maintains and the current resource usage. It also provides a userland API to manage that ruleset. Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
|
#
2bfc50bc |
|
04-Mar-2011 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add two new system calls, setloginclass(2) and getloginclass(2). This makes it possible for the kernel to track login class the process is assigned to, which is required for RCTL. This change also make setusercontext(3) call setloginclass(2) and makes it possible to retrieve current login class using id(1). Reviewed by: kib (as part of a larger patch)
|
#
96fcc75f |
|
01-Mar-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Add initial support for Capsicum's Capability Mode to the FreeBSD kernel, compiled conditionally on options CAPABILITIES: Add a new credential flag, CRED_FLAG_CAPMODE, which indicates that a subject (typically a process) is in capability mode. Add two new system calls, cap_enter(2) and cap_getmode(2), which allow setting and querying (but never clearing) the flag. Export the capability mode flag via process information sysctls. Sponsored by: Google, Inc. Reviewed by: anderson Discussed with: benl, kris, pjd Obtained from: Capsicum Project MFC after: 3 months
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
8d19559b |
|
30-Aug-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Make the syscalls reserved for AFS usable by OpenAFS port. Submitted by: Benjamin Kaduk <kaduk mit edu> MFC after: 2 weeks
|
#
13561ed4 |
|
26-Aug-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix typo. Submitted by: Ben Kaduk <minimarmot gmail com>
|
#
153ac44c |
|
28-Jun-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Count number of threads that enter and leave dynamically registered syscalls. On the dynamic syscall deregistration, wait until all threads leave the syscall code. This somewhat increases the safety of the loadable modules unloading. Reviewed by: jhb Tested by: pho MFC after: 1 month
|
#
790f66db |
|
08-Feb-2010 |
Ed Schouten <ed@FreeBSD.org> |
Remove unused LIBCOMPAT keyword from syscalls.master.
|
#
7e767511 |
|
19-Dec-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r198508, r198509: Reimplement pselect() in kernel, making change of sigmask and sleep atomic. MFC r198538: Move pselect(3) man page to section 2.
|
#
304e9b14 |
|
13-Dec-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Merge r197636 from head to stable/8: Reserve system call numbers for Capsicum security framework capabilities, capability mode, and process descriptors: cap_new, cap_getrights, cap_enter, cap_getmode, pdfork, pdkill, pdgetpid, and pdwait. Obtained from: TrustedBSD Project Sponsored by: Google
|
#
066d836b |
|
27-Oct-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
The current pselect(3) is implemented in usermode and is thus vulnerable to a well-known race condition, whose elimination was the reason the function was introduced in the first place. If the sigmask supplied as an argument to pselect() enables a signal, the signal might be delivered before the thread calls select(2), causing a lost wakeup. Reimplement pselect() in the kernel, making the change of sigmask and the sleep atomic. Since the signal shall be delivered to usermode, but the sigmask restored, set TDP_OLDMASK and save the old mask in td_oldsigmask. TDP_OLDMASK should be cleared by ast() in case the signal was not delivered during syscall execution. Reviewed by: davidxu Tested by: pho MFC after: 1 month
|
#
72c35fc6 |
|
30-Sep-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Reserve system call numbers for Capsicum security framework capabilities, capability mode, and process descriptors: cap_new, cap_getrights, cap_enter, cap_getmode, pdfork, pdkill, pdgetpid, and pdwait. Obtained from: TrustedBSD Project Sponsored by: Google MFC after: 3 weeks
|
#
c3889811 |
|
08-Jul-2009 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
There is an optimization in chmod(1) that makes it skip the chmod(2) call if the new file mode is the same as it was before; however, this optimization must be disabled for filesystems that support NFSv4 ACLs. Chmod uses pathconf(2) to determine whether this is the case - however, pathconf(2) always follows symbolic links, while 'chmod -h' doesn't. This change adds lpathconf(3) to make it possible to solve that problem in a clean way. Reviewed by: rwatson (earlier version) Approved by: re (kib)
|
#
b648d480 |
|
24-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Change the ABI of some of the structures used by the SYSV IPC API: - The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned short. - The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned short. - The mode member of struct ipc_perm is now mode_t instead of unsigned short (this is merely a style bug). - The rather dubious padding fields for ABI compat with SV/I386 have been removed from struct msqid_ds and struct semid_ds. - The shm_segsz member of struct shmid_ds is now a size_t instead of an int. This removes the need for the shm_bsegsz member in struct shmid_kernel and should allow for complete support of SYSV SHM regions >= 2GB. - The shm_nattch member of struct shmid_ds is now an int instead of a short. - The shm_internal member of struct shmid_ds is now gone. The internal VM object pointer for SHM regions has been moved into struct shmid_kernel. - The existing __semctl(), msgctl(), and shmctl() system call entries are now marked COMPAT7 and new versions of those system calls which support the new ABI are now present. - The new system calls are assigned to the FBSD-1.1 version in libc. The FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls. - A simplistic framework for tagging system calls with compatibility symbol versions has been added to libc. Version tags are added to system calls by adding an appropriate __sym_compat() entry to src/lib/libc/include/compat.h. [1] PR: kern/16195 kern/113218 bin/129855 Reviewed by: arch@, rwatson Discussed with: kan, kib [1]
|
#
45f48220 |
|
24-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Deprecate the msgsys(), semsys(), and shmsys() system calls by moving them under COMPAT_FREEBSD[4567]. Starting with FreeBSD 5.0 the SYSV IPC API was implemented via direct system calls (e.g. msgctl(), msgget(), etc.) rather than indirecting through the var-args *sys() system calls. The shmsys() system call was already effectively deprecated for all but COMPAT_FREEBSD4 already as its implementation for the !COMPAT_FREEBSD4 case was to simply invoke nosys().
|
#
3c366f1f |
|
24-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Add a new COMPAT7 flag for FreeBSD 7.x compatibility system calls.
|
#
0b0fe06a |
|
22-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Fix a typo in a comment.
|
#
21def99b |
|
17-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
- Add the ability to mix multiple flags separated by pipe ('|') characters in the type field of system call tables. Specifically, one can now use the 'NO*' types as flags in addition to the 'COMPAT*' types. For example, to tag 'COMPAT*' system calls as living in a KLD via NOSTD. The COMPAT* type is required to be listed first in this case. - Add new functions 'type()' and 'flag()' to the embedded awk script in makesyscalls.sh that return true if a requested flag is found in the type field ($3). The flag() function checks all of the flags in the field, but type() only checks the first flag. type() is meant to be used in the top-level "switch" statement and flag() should be used otherwise. - Retire the CPT_NOA type, it is now replaced with "COMPAT|NOARGS" using the flags approach. - Tweak the comment descriptions of COMPAT[46] system calls so that they say "freebsd[46] foo" rather than "old foo". - Document the COMPAT6 type. - Sync comments in compat32 syscall table with the master table.
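The difference between type() and flag() can be shown with a small C sketch of the matching rule. The real functions live in the awk script embedded in makesyscalls.sh; the names and code below are hypothetical and only mirror the behaviour the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the token of length len starting at tok equals name. */
static bool tok_eq(const char *tok, size_t len, const char *name)
{
    return strlen(name) == len && strncmp(tok, name, len) == 0;
}

/* type(): compare only the first '|'-separated flag of the field. */
static bool sk_type(const char *field, const char *name)
{
    return tok_eq(field, strcspn(field, "|"), name);
}

/* flag(): compare every '|'-separated flag of the field. */
static bool sk_flag(const char *field, const char *name)
{
    for (;;) {
        size_t len = strcspn(field, "|");

        if (tok_eq(field, len, name))
            return true;
        if (field[len] == '\0')
            return false;
        field += len + 1;   /* skip past the '|' separator */
    }
}
```

For the field "COMPAT|NOARGS", the sketch's type check matches only COMPAT, while the flag check matches both COMPAT and NOARGS, which is why the COMPAT* type has to be listed first.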
|
#
0ec0b41c |
|
17-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Remove the now-unused NOIMPL flag. It serves no useful purpose given the existing UNIMPL and NOSTD types.
|
#
f4258391 |
|
17-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
- NOSTD results in lkmressys being used instead of lkmssys. - Mark nfsclnt as UNIMPL. It should have been NOSTD instead of NOIMPL back when it lived in nfsclient.ko, but it was removed from that a long time ago.
|
#
c4f16b69 |
|
15-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Add a new 'void closefrom(int lowfd)' system call. When called, it closes any open file descriptors >= 'lowfd'. It is largely identical to the same function on other operating systems such as Solaris, DFly, NetBSD, and OpenBSD. One difference from other *BSD is that this closefrom() does not fail with any errors. In practice, while the manpages for NetBSD and OpenBSD claim that they return EINTR, they ignore internal errors from close() and never return EINTR. DFly does return EINTR, but for the common use case (closing fd's prior to execve()), the caller really wants all fd's closed and returning EINTR just forces callers to call closefrom() in a loop until it stops failing. Note that this implementation of closefrom(2) does not make any effort to resolve userland races with open(2) in other threads. As such, it is not multithread safe. Submitted by: rwatson (initial version) Reviewed by: rwatson MFC after: 2 weeks
|
#
b38ff370 |
|
29-Apr-2009 |
Jamie Gritton <jamie@FreeBSD.org> |
Introduce the extensible jail framework, using the same "name=value" interface as nmount(2). Three new system calls are added: * jail_set, to create jails and change the parameters of existing jails. This replaces jail(2). * jail_get, to read the parameters of existing jails. This replaces the security.jail.list sysctl. * jail_remove to kill off a jail's processes and remove the jail. Most jail parameters may now be changed after creation, and jails may be set to exist without any attached processes. The current jail(2) system call still exists, though it is now a stub to jail_set(2). Approved by: bz (mentor)
|
#
a1b5a895 |
|
09-Nov-2008 |
Ed Schouten <ed@FreeBSD.org> |
Mark uname(), getdomainname() and setdomainname() with COMPAT_FREEBSD4. Looking at our source code history, it seems the uname(), getdomainname() and setdomainname() system calls got deprecated somewhere after FreeBSD 1.1, but they have never been phased out properly. Because we don't have a COMPAT_FREEBSD1, just use COMPAT_FREEBSD4. Also fix the Linuxolator to build without the setdomainname() routine by just making it call userland_sysctl on kern.domainname. Also replace the setdomainname()'s implementation to use this approach, because we're duplicating code with sysctl_domainname(). I wasn't able to keep these three routines working in our COMPAT_FREEBSD32, because that would require yet another keyword for syscalls.master (COMPAT4+NOPROTO). Because this routine is probably unused already, this won't be a problem in practice. If it turns out to be a problem, we'll just restore this functionality. Reviewed by: rdivacky, kib
|
#
a9148abd |
|
03-Nov-2008 |
Doug Rabson <dfr@FreeBSD.org> |
Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. 
The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
48a43ae8 |
|
25-Sep-2008 |
John Baldwin <jhb@FreeBSD.org> |
Tidy up a few things with syscall generation: - Instead of using a syscall slot (370) just to get a function prototype for lkmressys(), add an explicit function prototype to <sys/sysent.h>. This also removes unused special case checks for 'lkmressys' from makesyscalls.sh. - Instead of having magic logic in makesyscalls.sh to only generate a function prototype the first time 'lkmnosys' is seen, make 'NODEF' always not generate a function prototype and include an explicit prototype for 'lkmnosys' in <sys/sysent.h>. - As a result of the fix in (2), update the LKM syscall entries in the freebsd32 syscall table to use 'lkmnosys' rather than 'nosys'. - Use NOPROTO for the __syscall() entry (198) in the native ABI. This avoids the need for magic logic in makesyscalls.sh to only generate a function prototype the first time 'nosys' is encountered.
|
#
e484af13 |
|
24-Aug-2008 |
Robert Watson <rwatson@FreeBSD.org> |
When MPSAFE ttys were merged, a new BSM audit event identifier was allocated for posix_openpt(2). Unfortunately, that identifier conflicts with other events already allocated to other systems in OpenBSM. Assign a new globally unique identifier and conform better to the AUE_ event naming scheme. This is a stopgap until a new OpenBSM import is done with the correct identifier, so we'll maintain this as a local diff in svn until then. Discussed with: ed Obtained from: TrustedBSD Project
|
#
35c316ca |
|
21-Aug-2008 |
David E. O'Brien <obrien@FreeBSD.org> |
Add comments on NOARGS, NODEF, and NOPROTO.
|
#
bc093719 |
|
20-Aug-2008 |
Ed Schouten <ed@FreeBSD.org> |
Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan
|
#
8b07e49a |
|
09-May-2008 |
Julian Elischer <julian@FreeBSD.org> |
Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x). Currently the only protocol that can make use of the multiple tables is IPv4. Similar functionality exists in OpenBSD and Linux. From my notes: ----- One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address. Constraints: ------------ I want to make some form of this available in the 6.x tree (and by extension 7.x), but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need. One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing". One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as an EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in the timespan of the branch. This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). 
The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs. To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family. The basic change in the ABI compatible version of the change is to extend that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X], where Y is always 0 for all protocol families except ipv4. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before. The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row. In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4, forcing the action to row 0, or to the appropriate row if it IS IPv4 (and that info is available). 
These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later. One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically). You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it. This brings us to how the correct FIB is selected for an outgoing IPV4 packet. Firstly, all packets have a FIB associated with them. If nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways. Packets fall into one of a number of classes. 1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility called setfib that acts a bit like nice.. setfib -3 ping target.example.com # will use fib 3 for ping. It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands. 2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.) 3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. 
A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2). 4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib. 5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being responded to. 6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect with the process that set up the tunnel. Thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1. Routing messages would be associated with their process, and thus select one FIB or another. Messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented) In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB. In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process. Early testing experience: ------------------------- Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks. For example, it can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done. Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routed accordingly. ipfw has grown 2 new keywords: setfib N ip from any to any count ip from any to any fib N In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. 
I am not sure if it is required. SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. It will be interesting to see how that handles it when it suddenly actually does something. Where to next: -------------------- After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. This will result in some roto-tilling in the routing code. Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code. My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing to the data. Instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it. When the ABI can be changed it raises the possibility of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the destination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry. Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already. This work was sponsored by Ironport Systems/Cisco Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco
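The two-dimensional table lookup described in this message can be sketched in C roughly as follows. This is only an illustration of the row-selection rule (row 0 for every family except IPv4), not the kernel code: every `sketch_` / `SKETCH_` name, the table sizes, and the family numbers are hypothetical and chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes and family numbers, for illustration only. */
#define SKETCH_NUMFIBS  16  /* the message mentions a 16-table limit */
#define SKETCH_AF_MAX   8   /* number of protocol families in this sketch */
#define SKETCH_AF_INET  2   /* stand-in for the IPv4 family */

/* Stand-in for a per-family routing table head. */
struct sketch_rib_head {
    int nroutes;
};

/* Row = FIB number (Y), column = protocol family (X). */
static struct sketch_rib_head sketch_rt_tables[SKETCH_NUMFIBS][SKETCH_AF_MAX];

/*
 * FIB-aware lookup: only IPv4 honours the requested row; every other
 * family is forced to row 0, as the message describes.
 */
static struct sketch_rib_head *
sketch_rt_table_fib(int family, int fib)
{
    assert(family >= 0 && family < SKETCH_AF_MAX);
    assert(fib >= 0 && fib < SKETCH_NUMFIBS);
    if (family != SKETCH_AF_INET)
        fib = 0;
    return &sketch_rt_tables[fib][family];
}

/*
 * Legacy-style entry point: callers unaware of FIBs always see row 0,
 * which looks like the old one-dimensional array.
 */
static struct sketch_rib_head *
sketch_rt_table(int family)
{
    return sketch_rt_table_fib(family, 0);
}
```

With these definitions, `sketch_rt_table_fib(SKETCH_AF_INET, 3)` resolves to row 3, while the same call for any other family, or any call through `sketch_rt_table()`, resolves to row 0.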
|
#
7104518b |
|
30-Mar-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Add the openat(), fexecve() and other *at() syscalls to the table. Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho
|
#
dfdcada3 |
|
26-Mar-2008 |
Doug Rabson <dfr@FreeBSD.org> |
Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. 
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks
|
#
7f64829a |
|
25-Mar-2008 |
Ruslan Ermilov <ru@FreeBSD.org> |
Fixed type of the fourth argument of cpuset_{get,set}affinity(2) to be size_t. Prodded by: davidxu
|
#
6617724c |
|
12-Mar-2008 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
|
#
d7f687fc |
|
02-Mar-2008 |
Jeff Roberson <jeff@FreeBSD.org> |
Add cpuset, an api for thread to cpu binding and cpu resource grouping and assignment. - Add a reference to a struct cpuset in each thread that is inherited from the thread that created it. - Release the reference when the thread is destroyed. - Add prototypes for syscalls and macros for manipulating cpusets in sys/cpuset.h - Add syscalls to create, get, and set new numbered cpusets: cpuset(), cpuset_{get,set}id() - Add syscalls for getting and setting affinity masks for cpusets or individual threads: cpuset_{get,set}affinity() - Add types for the 'level' and 'which' parameters for the cpuset. This will permit expansion of the api to cover cpu masks for other objects identifiable with an id_t integer. For example, IRQs and Jails may be coming soon. - The root set 0 contains all valid cpus. All threads initially belong to cpuset 1. This permits migrating all threads off of certain cpus to reserve them for special applications. Sponsored by: Nokia Discussed with: arch, rwatson, brooks, davidxu, deischen Reviewed by: antoine
|
#
5f56182b |
|
12-Feb-2008 |
Ruslan Ermilov <ru@FreeBSD.org> |
Change readlink(2)'s return type and type of the last argument to match POSIX. Prodded by: Alexey Lyashkov
|
#
6c902059 |
|
20-Jan-2008 |
Robert Watson <rwatson@FreeBSD.org> |
Use audit events AUE_SHMOPEN and AUE_SHMUNLINK with new system calls shm_open() and shm_unlink(). More auditing will need to be done for these calls to capture arguments properly.
|
#
8e38aeff |
|
08-Jan-2008 |
John Baldwin <jhb@FreeBSD.org> |
Add a new file descriptor type for IPC shared memory objects and use it to implement shm_open(2) and shm_unlink(2) in the kernel: - Each shared memory file descriptor is associated with a swap-backed vm object which provides the backing store. Each descriptor starts off with a size of zero, but the size can be altered via ftruncate(2). The shared memory file descriptors also support fstat(2). read(2), write(2), ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared memory file descriptors. - shm_open(2) and shm_unlink(2) are now implemented as system calls that manage shared memory file descriptors. The virtual namespace that maps pathnames to shared memory file descriptors is implemented as a hash table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash of the pathname. - As an extension, the constant 'SHM_ANON' may be specified in place of the path argument to shm_open(2). In this case, an unnamed shared memory file descriptor will be created similar to the IPC_PRIVATE key for shmget(2). Note that the shared memory object can still be shared among processes by sharing the file descriptor via fork(2) or sendmsg(2), but it is unnamed. This effectively serves to implement the getmemfd() idea bandied about the lists several times over the years. - The backing store for a shared memory file descriptor is garbage collected when it is not referenced by any open file descriptors or the shm_open(2) virtual namespace. Submitted by: dillon, peter (previous versions) Submitted by: rwatson (I based this on his version) Reviewed by: alc (suggested converting getmemfd() to shm_open())
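As a rough illustration of the pathname hashing mentioned above, here is a minimal 32-bit Fowler/Noll/Vo hash in C. The message only says "32-bit Fowler/Noll/Vo"; the choice of the FNV-1 variant (multiply, then XOR) is an assumption, and the `sketch_` names and the power-of-two bucket mask are hypothetical, not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 32-bit FNV offset basis and prime. */
#define SKETCH_FNV1_32_INIT  UINT32_C(0x811c9dc5)
#define SKETCH_FNV_32_PRIME  UINT32_C(0x01000193)

/* FNV-1 over a NUL-terminated string: multiply by the prime, then XOR the byte. */
static uint32_t
sketch_fnv1_32_str(const char *s)
{
    uint32_t h = SKETCH_FNV1_32_INIT;

    while (*s != '\0') {
        h *= SKETCH_FNV_32_PRIME;
        h ^= (uint32_t)(unsigned char)*s++;
    }
    return h;
}

/*
 * Map a pathname to a hash-table bucket. The power-of-two table size
 * and the mask are assumptions made for this sketch.
 */
static unsigned
sketch_shm_bucket(const char *path, unsigned nbuckets)
{
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    return sketch_fnv1_32_str(path) & (nbuckets - 1);
}
```

A caller would use something like `sketch_shm_bucket("/my_segment", 64)` to pick the chain to search for an existing named object.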
|
#
7188e3c8 |
|
19-Oct-2007 |
Ed Maste <emaste@FreeBSD.org> |
Put comments about syscalls by the correct ones, and use the correct syscall number in the comment.
|
#
0b1f0611 |
|
15-Aug-2007 |
David Xu <davidxu@FreeBSD.org> |
Add thr_kill2 syscall which sends a signal to a thread in another process. Submitted by: Tijl Coosemans tijl at ulyssis dot org Approved by: re (kensmith)
|
#
51504d9a |
|
04-Jul-2007 |
Peter Wemm <peter@FreeBSD.org> |
Create new syscalls for mmap(), lseek(), pread(), pwrite(), truncate() and ftruncate(), but without the pad arg. There are several reasons for this. Consider 'mmap()'. On AMD64, the function call (and syscall) ABI allow for 6 register arguments. Additional arguments go on the stack. mmap(2) has 6 arguments. However, the syscall definition has an extra 'int pad' argument. This pushes it to 7 arguments, which means one must spill into the memory stack. Since the kernel API doesn't match userland API, we have a hack in libc - libc/sys/mmap.c. This implements the userland API by calling __syscall() with an extra argument and the pad argument, for a total of 8 args. This is all unnecessary and inconvenient for several things, including the kernel's syscall handler code which now has to handle merging stack arguments with register arguments. It is a big deal for certain 3rd party code. I'm adding libc glue to make the transition totally painless. I had intended to mark the old syscalls as COMPAT6, but the potential to shoot your feet by building a new kernel without COMPAT_FREEBSD6 but with a slightly older userland was too great. For now, they have manual "freebsd6_" prefixes rather than being COMPAT6. They will go back to being marked 'COMPAT6' after 7-stable starts. Approved by: re (kensmith)
|
#
f8829a4a |
|
03-Nov-2006 |
Randall Stewart <rrs@FreeBSD.org> |
Ok, here it is, we finally add SCTP to current. Note that this work is not just mine, but it is also the works of Peter Lei and Michael Tuexen. They both are my two key other developers working on the project.. and they need ata-boy's too: **** peterlei@cisco.com tuexen@fh-muenster.de **** I did do a make sysent which updated the syscall's and sysproto.. I hope that is correct... without it you don't build since we have new syscalls for SCTP :-0 So go out and look at the NOTES, add option SCTP (make sure inet and inet6 are present too) and play with SCTP. I will see about committing some test tools I have after I figure out where I should place them. I also have a lib (libsctp.a) that adds some of the missing socketapi functions that I need to put into lib's.. I will talk to George about this :-) There may still be some 64 bit issues in here, none of us have a 64 bit processor to test with yet.. Michael may have a MAC but that's another beast too.. If you have a mac and want to use SCTP contact Michael he maintains a web site with a loadable module with this code :-) Reviewed by: gnn Approved by: gnn
|
#
5f641fc0 |
|
16-Oct-2006 |
David Xu <davidxu@FreeBSD.org> |
o Add keyword volatile for user mutex owner field. o Fix type consistency problem by using type long for old umtx and wait channel. o Rename casuptr to casuword.
|
#
888db9e1 |
|
03-Oct-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Audit creat() system call (compat code), and change type for getpagesize(), which isn't actually being audited anyway. MFC after: 3 days Obtained from: TrustedBSD Project
|
#
73fa3e5b |
|
20-Sep-2006 |
David Xu <davidxu@FreeBSD.org> |
Replace system call thr_getscheduler, thr_setscheduler, thr_setschedparam with rtprio_thread, while rtprio system call is for process only, the new system call rtprio_thread is responsible for LWP.
|
#
6c2d307a |
|
17-Sep-2006 |
Robert Watson <rwatson@FreeBSD.org> |
AUE_SIGALTSTACK instead of AUE_SIGPENDING for sigaltstack(). Obtained from: TrustedBSD Project MFC after: 3 days
|
#
7f26ddda |
|
03-Sep-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Assign proper audit event identifiers to a number of system calls not covered in previous passes: - sysarch, rtprio - clock_settime - preadv/pwritev - __getcwd - kqueue - fhstatfs - kldunloadf Obtained from: TrustedBSD Project
|
#
d1967c5d |
|
03-Sep-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Use AUE_NTP_ADJTIME for ntp_adjtime() instead of AUE_ADJTIME. Obtained from: TrustedBSD Project
|
#
d10183d9 |
|
27-Aug-2006 |
David Xu <davidxu@FreeBSD.org> |
This is the initial version of POSIX priority mutex support, a new userland mutex structure is added as follows: struct umutex { __lwpid_t m_owner; uint32_t m_flags; uint32_t m_ceilings[2]; uint32_t m_spare[4]; }; The m_owner represents owner thread, it is a thread id, in non-contested case, userland can simply use atomic_cmpset_int to lock the mutex, if the mutex is contested, high order bit will be set, and userland should do locking and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will fail and kernel syscall should be invoked to do locking, this is because, for such a mutex, kernel should always boost the thread's priority before it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex, the first element is used to boost thread's priority when it locked the mutex, second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT mutex's link list is kept in userland, the m_ceiling[1] is managed by thread library so kernel needn't allocate memory to keep the link list, when such a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED. Flag USYNC_PROCESS_SHARED indicates if the synchronization object is process shared, if the flag is not set, it saves a vm_map_lookup() call. The umtx chain is still used as a sleep queue, when a thread is blocked on PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority propagating, it is dynamically allocated and reference count is used, it is not optimized but works well in my tests, while the umtx chain has its own locking protocol, the priority propagating protocol are all protected by sched_lock because priority propagating function is called with sched_lock held from scheduler. 
No visible performance degradation is found with these changes. Some parameter names in _umtx_op syscall are renamed.
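The uncontested fast path described in this message can be sketched with C11 atomics. This is a userland illustration under stated assumptions, not the libthr or kernel code: it substitutes `atomic_compare_exchange_strong` for `atomic_cmpset_int`, and the `sketch_` names, the value 0 for "unowned", and the exact contested-bit constant are hypothetical. Whenever a fast-path function returns false, real code would fall back to the `_umtx_op` syscall, which is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_UNOWNED    UINT32_C(0)           /* assumed "no owner" value */
#define SKETCH_CONTESTED  UINT32_C(0x80000000)  /* high-order bit, per the message */

/* Same field layout as the struct in the message; m_owner made atomic here. */
struct sketch_umutex {
    _Atomic uint32_t m_owner;
    uint32_t m_flags;
    uint32_t m_ceilings[2];
    uint32_t m_spare[4];
};

/* Try to take the mutex without entering the kernel. */
static bool
sketch_lock_fast(struct sketch_umutex *m, uint32_t tid)
{
    uint32_t expected = SKETCH_UNOWNED;

    assert(tid != 0 && (tid & SKETCH_CONTESTED) == 0);
    /* Succeeds only if nobody owns the mutex and it is not marked contested. */
    return atomic_compare_exchange_strong(&m->m_owner, &expected, tid);
}

/* Try to release without entering the kernel. */
static bool
sketch_unlock_fast(struct sketch_umutex *m, uint32_t tid)
{
    uint32_t expected = tid;

    /* Fails if the contested bit was set while we held the mutex. */
    return atomic_compare_exchange_strong(&m->m_owner, &expected,
        SKETCH_UNOWNED);
}
```

Initialising `m_owner` to the contested value, as the message describes for PTHREAD_PRIO_PROTECT mutexes, makes `sketch_lock_fast()` fail every time, so each lock operation is forced through the kernel where the priority boost can be applied.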
|
#
bad9a7a5 |
|
16-Aug-2006 |
Peter Wemm <peter@FreeBSD.org> |
Grab two syscall numbers. One is used to emulate functionality that linux has in its procfs (do a readlink of /proc/self/fd/<nn> to find the pathname that corresponds to a given file descriptor). Valgrind-3.x needs this functionality. This is a placeholder only at this time.
|
#
589201fd |
|
15-Aug-2006 |
John Baldwin <jhb@FreeBSD.org> |
- Use NOSTD rather than NOIMPL for nfssvc() to match other syscalls provided via klds. - Correct audit identifier for nfssvc().
|
#
af5bf122 |
|
28-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.
|
#
e0b4add8 |
|
28-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
Various fixes to comments in the syscall master files including removing cruft from the audit import and adding mention of COMPAT4 to freebsd32.
|
#
60088160 |
|
13-Jul-2006 |
David Xu <davidxu@FreeBSD.org> |
Add syscalls thr_setscheduler, thr_getscheduler, and thr_setschedparam, these syscalls are designed to set thread's scheduling parameters and policy, because each syscall contains a size parameter, it is possible to support future scheduling option, e.g SCHED_SPORADIC, this option needs other fields in structure sched_param, currently they are not available.
|
#
be5747d5 |
|
11-Jul-2006 |
John Baldwin <jhb@FreeBSD.org> |
- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.
|
#
bbe5d031 |
|
05-Jul-2006 |
Wayne Salamon <wsalamon@FreeBSD.org> |
Add audit events for the extended attribute system calls. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)
|
#
597d608f |
|
27-Jun-2006 |
John Baldwin <jhb@FreeBSD.org> |
- Expand the scope of Giant some in mount(2) to protect the vfsp structure from going away. mount(2) is now MPSAFE. - Expand the scope of Giant some in unmount(2) to protect the mp structure (or rather, to handle concurrent unmount races) from going away. umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount(). - nmount(2) and linux_mount() were already MPSAFE.
|
#
867c089b |
|
28-Mar-2006 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Revert previous commit at davidxu's insistence. Instead, use __DECONST (argh!) and rearrange the prototypes to make it clear that _umtx_op() is not deprecated.
|
#
b3efbabe |
|
28-Mar-2006 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
The undocumented and deprecated system call _umtx_op() takes two pointer arguments. The first one is never used (all callers pass in 0); the second is sometimes used to pass in a struct timespec * which is used as a timeout and never modified. Constify that argument so callers can pass a const struct timespec * without jumping through hoops.
|
#
99eee864 |
|
23-Mar-2006 |
David Xu <davidxu@FreeBSD.org> |
Implement aio_fsync() syscall.
|
#
61d3a4ef |
|
28-Feb-2006 |
David Xu <davidxu@FreeBSD.org> |
Let the kernel POSIX timer code and mqueue code use an integer as a resource handle; the timer_t and mqd_t types will be pointers that userland defines.
|
#
c983324e |
|
05-Feb-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Prefer AUE_FOO audit identifiers to AUE_O_FOO, which are largely left over from the Darwin implementation. When we implement a system call as a wrapper to sysctl(), audit it as AUE_SYSCTL. This leads to greater compatibility with Solaris audit trails as sysctl() argument tokens are not the same as the ones for the original system calls (i.e., setdomainname()). Replace references to AUE_ events that are equivalent to AUE_NULL with AUE_NULL. In the case of process signal configuration, this is because these events do not require auditing. Move from the Darwin spelling of getsockopt() to the FreeBSD/Solaris one. Audit nmount(). Obtained from: TrustedBSD Project
|
#
9e7d7224 |
|
04-Feb-2006 |
David Xu <davidxu@FreeBSD.org> |
Implement thr_set_name to set a name for thread. Reviewed by: julian
|
#
62646c07 |
|
03-Feb-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Assign audit event identifiers to many system calls. Much work by: wsalamon Obtained from: TrustedBSD Project
|
#
35d29f50 |
|
01-Feb-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Map audit-related system calls to audit event identifiers. Much work by: wsalamon Obtained from: TrustedBSD Project
|
#
1ce91824 |
|
21-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Make aio code MP safe.
|
#
5a56b437 |
|
23-Dec-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add abort2() systemcall.
|
#
94e1294b |
|
26-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Don't use OpenBSD syscall numbers, instead, use new syscall numbers for POSIX message queue. Suggested by: rwatson
|
#
655291f2 |
|
25-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Bring in experimental kernel support for POSIX message queue.
|
#
0972628a |
|
29-Oct-2005 |
David Xu <davidxu@FreeBSD.org> |
Fix sigevent's POSIX incompatible problem by adding member fields sigev_notify_function and sigev_notify_attributes. AIO syscalls use sigevent, so they have to be adjusted. Reviewed by: alc
|
#
86857b36 |
|
22-Oct-2005 |
David Xu <davidxu@FreeBSD.org> |
Implement POSIX timers. Currently only CLOCK_REALTIME and CLOCK_MONOTONIC clock are supported. I plan to merge XSI timer ITIMER_REAL and other two CPU timers into the new code, currently three slots are available for the XSI timers. The SIGEV_THREAD notification type is not supported yet because our sigevent struct lacks two member fields: sigev_notify_function sigev_notify_attributes I have found the sigevent is used in AIO, so I won't add the two members unless the AIO code is adjusted.
|
#
d60e86c8 |
|
18-Oct-2005 |
Stefan Farfeleder <stefanf@FreeBSD.org> |
Const-qualify ksem_timedwait's parameter abstime as it's only passed in.
|
#
9104847f |
|
13-Oct-2005 |
David Xu <davidxu@FreeBSD.org> |
1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they use member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. Kernel code that uses psignal() still behaves as before, it won't fail even under memory pressure, the only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Currently there is no kernel code that will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as the specification says. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64
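From userland, the sigqueue() behaviour described here (a signal queued together with a value that reaches the receiver in siginfo_t) can be exercised with standard POSIX calls. The following is a minimal single-threaded sketch; the helper name `sketch_sigqueue_roundtrip` is hypothetical, and SIGUSR1 is used only because the message notes that all signals, not just realtime ones, may be queued.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigqueue()/sigwaitinfo() on strict libcs */
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/*
 * Queue SIGUSR1 to the calling process with an integer payload and read
 * the payload back from siginfo_t. Returns the payload, or -1 on error.
 */
static int
sketch_sigqueue_roundtrip(int payload)
{
    sigset_t set;
    siginfo_t info;
    union sigval value;

    assert(payload >= 0);  /* -1 is reserved for errors in this sketch */

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Block the signal so it stays pending instead of terminating us. */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    value.sival_int = payload;
    /* May fail with EAGAIN when queue resources are unavailable. */
    if (sigqueue(getpid(), SIGUSR1, value) != 0)
        return -1;

    /* Dequeue the pending signal together with its siginfo data. */
    if (sigwaitinfo(&set, &info) != SIGUSR1)
        return -1;
    return info.si_value.sival_int;
}
```

Calling `sketch_sigqueue_roundtrip(42)` should hand back the same value that was queued; note that SIGUSR1 is left blocked in the caller afterwards.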
|
#
7f300b47 |
|
27-Sep-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Mark the extended attribute syscalls as being MP safe. Requested by: jhb
|
#
4acd2e73 |
|
08-Jul-2005 |
John Baldwin <jhb@FreeBSD.org> |
Mark second instance of lchown() MP safe just like the first. Approved by: re (scottl)
|
#
bcd9e0dd |
|
07-Jul-2005 |
John Baldwin <jhb@FreeBSD.org> |
- Add two new system calls: preadv() and pwritev() which are like readv() and writev() except that they take an additional offset argument and do not change the current file position. In SAT speak: preadv:readv::pread:read and pwritev:writev::pwrite:write. - Try to reduce code duplication some by merging most of the old kern_foov() and dofilefoo() functions into new dofilefoo() functions that are called by kern_foov() and kern_pfoov(). The non-v functions now all generate a simple uio on the stack from the passed in arguments and then call kern_foov(). For example, read() now just builds a uio and calls kern_readv() and pwrite() just builds a uio and calls kern_pwritev(). PR: kern/80362 Submitted by: Marc Olzheim marcolz at stack dot nl (1) Approved by: re (scottl) MFC after: 1 week
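The difference from readv() can be seen from userland: preadv() scatters data read from an explicit offset into several buffers and leaves the descriptor's file position alone. A minimal sketch follows, assuming a writable /tmp; the helper name `sketch_preadv_keeps_offset` is hypothetical, and the `_DEFAULT_SOURCE` define is only an assumption about what some libcs need to expose preadv().

```c
#define _DEFAULT_SOURCE  /* assumption: needed on some libcs to declare preadv() */
#include <sys/types.h>
#include <sys/uio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Write "hello world" to a temporary file, rewind to offset 0, then
 * preadv() five bytes from offset 6 into two buffers. Returns 0 if the
 * data arrived as expected and the file position is still 0, else -1.
 */
static int
sketch_preadv_keeps_offset(void)
{
    char path[] = "/tmp/preadv_sketch.XXXXXX";
    char a[3], b[2];
    struct iovec iov[2] = {
        { .iov_base = a, .iov_len = sizeof(a) },
        { .iov_base = b, .iov_len = sizeof(b) },
    };
    ssize_t n;
    off_t pos;
    int fd;

    assert(iov[0].iov_len + iov[1].iov_len == 5);

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file disappears once fd is closed */

    if (write(fd, "hello world", 11) != 11 ||
        lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }

    n = preadv(fd, iov, 2, 6);       /* reads "world" starting at offset 6 */
    pos = lseek(fd, 0, SEEK_CUR);    /* position should be unchanged */
    close(fd);

    if (n != 5 || pos != 0)
        return -1;
    if (memcmp(a, "wor", 3) != 0 || memcmp(b, "ld", 2) != 0)
        return -1;
    return 0;
}
```

A return value of 0 means both buffers were filled from offset 6 while the descriptor's position stayed at 0.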
|
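The relationship preadv:readv::pread:read from the entry above can be sketched from userland as follows. This is only an illustration under assumptions: the function name demo_preadv, the /tmp template, and the sample text are made up for the example, and nothing here reflects the kernel-side kern_preadv() code.

```c
#define _DEFAULT_SOURCE /* glibc: expose preadv() and mkstemp(); harmless elsewhere */
#include <sys/types.h>
#include <sys/uio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical demo: write a short string to a temporary file,
 * scatter-read part of it with preadv(), and confirm that the file
 * position did not move. Returns 0 on success, -1 on failure.
 */
int
demo_preadv(void)
{
	char path[] = "/tmp/preadv_demo.XXXXXX";
	const char text[] = "hello, world";
	char first[3], second[2];
	struct iovec iov[2];
	off_t before;
	int fd, rc = -1;

	if ((fd = mkstemp(path)) == -1)
		return (-1);
	unlink(path);	/* the open descriptor keeps the file alive */

	if (write(fd, text, strlen(text)) != (ssize_t)strlen(text))
		goto out;
	before = lseek(fd, 0, SEEK_CUR);	/* position after the write */

	iov[0].iov_base = first;
	iov[0].iov_len = sizeof(first);
	iov[1].iov_base = second;
	iov[1].iov_len = sizeof(second);

	/* Read 5 bytes starting at offset 7 ("world"), split over two buffers. */
	if (preadv(fd, iov, 2, 7) != 5)
		goto out;

	if (memcmp(first, "wor", 3) == 0 && memcmp(second, "ld", 2) == 0 &&
	    lseek(fd, 0, SEEK_CUR) == before)
		rc = 0;
out:
	close(fd);
	return (rc);
}
```

A plain readv() at this point would have started from the current position (end of file) and returned 0 bytes; the explicit offset is what lets preadv() read earlier data without an intervening lseek().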
#
f3596e33 |
|
30-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce a new field in the syscalls.master file format to hold the audit event identifier associated with each system call, which will be stored by makesyscalls.sh in the sy_auevent field of struct sysent. For now, default the audit identifier on all system calls to AUE_NULL, but in the near future, other BSM event identifiers will be used. The mapping of system calls to event identifiers is many:one due to multiple system calls that map to the same end functionality across compatibility wrappers, ABI wrappers, etc. Submitted by: wsalamon Obtained from: TrustedBSD Project
|
#
45cb0a00 |
|
29-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Normalize white space in syscalls.master: try to use tabs before system call types.
|
#
d85bfefd |
|
28-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Mark ntp_gettime() as MSTD, since its system call path will acquire Giant if required.
|
#
d7b9187b |
|
28-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Mark the following compatibility system calls as MCOMPAT or MCOMPAT4 based on their simply wrapping MPSAFE implementations of existing MPSAFE system calls: getfsstat() lseek() stat() lstat() truncate() ftruncate() statfs() fstatfs() Note that ogetdirentries() is not marked MPSAFE because it does not share the MPSAFE implementation used for getdirentries(), and requires separate locking to be implemented.
|
#
160349ad |
|
28-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Mark quotactl() as MSTD.
|
#
ec792a67 |
|
28-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Mark kenv(2) as MPSAFE, since it appears to be properly locked down.
|
#
5267dc0b |
|
28-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Also mark the COMPAT4 version of fhstatfs() as MPSAFE.
|
#
2191a5d1 |
|
27-May-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Mark fhopen(), fhstat(), and fhstatfs() as MSTD, since they now acquire Giant themselves.
|
#
c4bd610f |
|
22-Apr-2005 |
David Xu <davidxu@FreeBSD.org> |
Add new syscall thr_new to create a thread atomically; it will inherit the signal mask from the parent thread and set up TLS, the stack, and the user entry address. Also support POSIX thread's PTHREAD_SCOPE_PROCESS and PTHREAD_SCOPE_SYSTEM; a sysctl is also provided to control the scheduler scope.
|
#
b2624444 |
|
09-Mar-2005 |
Stefan Farfeleder <stefanf@FreeBSD.org> |
Fix typo in comment.
|
#
96d31285 |
|
01-Mar-2005 |
Paul Saab <ps@FreeBSD.org> |
Change the prototype of kevent to remove the const from the changelist. Reviewed by: jhb
|
#
810ad5ec |
|
25-Jan-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Struct mount is not yet locked well enough to allow mount/nmount/unmount to run without Giant. Mark them as STD here.
|
#
29ed48fc |
|
24-Jan-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Change all VFS syscalls to MSTD as they all manually deal with giant or the appropriate filesystem locks. Sponsored By: Isilon Systems, Inc.
|
#
fe0ef598 |
|
02-Jan-2005 |
Marcel Moolenaar <marcel@FreeBSD.org> |
uuidgen(2) is MP safe.
|
#
c180db2b |
|
25-Dec-2004 |
David Xu <davidxu@FreeBSD.org> |
Make _umtx_op() a more general interface: the final parameter need not be a timespec pointer, and every parameter is interpreted according to the opcode.
|
#
50586e8b |
|
17-Dec-2004 |
David Xu <davidxu@FreeBSD.org> |
1. Make umtx sharable between processes: two or more processes call mmap() to create a shared space and then initialize a umtx in it; after that, threads in different processes can use the umtx the same way as threads in the same process. 2. Introduce a new syscall _umtx_op to support timed lock and condition variable semantics. Also, the original umtx_lock and umtx_unlock inline functions are now reimplemented using _umtx_op; _umtx_op can use an arbitrary id, not just a thread id.
|
#
7fa77ace |
|
24-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Mark mount, unmount and nmount MPSAFE
|
#
6b270b48 |
|
18-Nov-2004 |
Mark Santcroos <marks@FreeBSD.org> |
Add ntp_gettime(2) system call. Reviewed by: imp, phk, njl, peter Approved by: njl
|
#
3e8c2449 |
|
23-Oct-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Add system call place-holders for the following system calls implementing Sun's BSM Audit API on FreeBSD: audit() auditon() getauid() setauid() getaudit() setaudit() getaudit_addr() setaudit_addr() auditctl() Submitted by: Wayne Salamon <wsalamon at computer dot org> Obtained from: TrustedBSD Project
|
#
ebfcca3d |
|
06-Oct-2004 |
David Xu <davidxu@FreeBSD.org> |
Regen to unbreak world. Pointy hat to: mtm
|
#
1a946b9f |
|
13-Jul-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add kldunloadf() system call. Stay tuned for following commit messages.
|
#
507b0318 |
|
12-Jul-2004 |
David Xu <davidxu@FreeBSD.org> |
Change kse_switchin to accept a kse_thr_mailbox pointer; the syscall will be used heavily in debugging KSE threads. This breaks libpthread on IA64, but because libpthread was not in the 5.2.1 release, I would like to change it so we needn't introduce another syscall.
|
#
cd28f17d |
|
01-Jul-2004 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Change the thread ID (thr_id_t) used for 1:1 threading from being a pointer to the corresponding struct thread to the thread ID (lwpid_t) assigned to that thread. The primary reason for this change is that libthr now internally uses the same ID as the debugger and the kernel when referencing to a kernel thread. This allows us to implement the support for debugging without additional translations and/or mappings. To preserve the ABI, the 1:1 threading syscalls, including the umtx locking API have not been changed to work on a lwpid_t. Instead the 1:1 threading syscalls operate on long and the umtx locking API has not been changed except for the contested bit. Previously this was the least significant bit. Now it's the most significant bit. Since the contested bit should not be tested by userland, this change is not expected to be visible. Just to be sure, UMTX_CONTESTED has been removed from <sys/umtx.h>. Reviewed by: mtm@ ABI preservation tested on: i386, ia64
|
#
2ed57081 |
|
21-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Mark unlink() as MPSAFE as we now acquire Giant in the unlink() system call.
|
#
61d87ffd |
|
21-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Mark link() system call as MPSAFE.
|
#
0b0a60fb |
|
05-Apr-2004 |
Doug Rabson <dfr@FreeBSD.org> |
Add lgetfh(2) which is like getfh(2) but doesn't follow symlinks.
|
#
1713a516 |
|
27-Mar-2004 |
Mike Makonnen <mtm@FreeBSD.org> |
Separate thread synchronization from signals in libthr. Instead use msleep() and wakeup_one(). Discussed with: jhb, peter, tjr
|
#
1f325ae3 |
|
16-Mar-2004 |
David Malone <dwmalone@FreeBSD.org> |
Get ready to mark open, creat and nosys as MPSAFE.
|
#
8ac61436 |
|
15-Mar-2004 |
John Baldwin <jhb@FreeBSD.org> |
Drop the proc lock around calls to the MD functions ptrace_single_step(), ptrace_set_pc(), and cpu_ptrace() so that those functions are free to acquire Giant, sleep, etc. We already do a PHOLD/PRELE around them so that it is safe to sleep inside of these routines if necessary. This allows ptrace() to be marked MP safe again as it no longer triggers lock order reversals on Alpha. Tested by: wilko
|
#
37814395 |
|
13-Mar-2004 |
Peter Wemm <peter@FreeBSD.org> |
Push Giant down a little further: - no longer serialize on Giant for thread_single*() and family in fork, exit and exec - thread_wait() is mpsafe, assert no Giant - reduce scope of Giant in exit to not cover thread_wait and just do vm_waitproc(). - assert that thread_single() family are not called with Giant - remove the DROP/PICKUP_GIANT macros from thread_single() family - assert that thread_suspend_check() is not called with Giant - remove manual drop_giant hack in thread_suspend_check since we know it isn't held. - remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family - mark kse_create() mpsafe
|
#
aae94fbb |
|
02-Feb-2004 |
Daniel Eischen <deischen@FreeBSD.org> |
Add ksem_timedwait() to complement ksem_wait(). Glanced at by: alfred
|
#
866e3b7e |
|
25-Dec-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Put restrict back in, the compilation failure was my fault when I did a bad merge from the PR. Thanks to Bruce Evans for explaining.
|
#
6502da13 |
|
24-Dec-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
We're not ready for restrict qualifiers here.
|
#
9f144cff |
|
24-Dec-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Add restrict qualifiers. PR: 44394 Submitted by: Craig Rodrigues <rodrige@attbi.com>
|
#
eec525a4 |
|
22-Dec-2003 |
Peter Wemm <peter@FreeBSD.org> |
Remove namespc column and attempt to un-fold some of the longer lines that now fit.
|
#
5352eb6b |
|
10-Dec-2003 |
Peter Wemm <peter@FreeBSD.org> |
Update file locations for syscall tables to copy to.
|
#
702b2a17 |
|
07-Dec-2003 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Add kse_switchin(2). This syscall can be used by KSE implementations to have the kernel switch to a new thread, instead of doing it in userland. It is in fact needed on ia64 where syscall restarts do not return to userland first. It's completely handled inside the kernel. As such, any context created by the kernel as part of an upcall and caused by some syscall needs to be restored by the kernel.
|
#
5c49a056 |
|
13-Nov-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Revision 1.156 marked ptrace() SMP safe. Unfortunately, alpha implements parts of ptrace using proc_rwmem(). proc_rwmem() requires giant, and giant must be acquired prior to the proc lock, so ptrace must require giant still.
|
#
fde81c7d |
|
12-Nov-2003 |
Kirk McKusick <mckusick@FreeBSD.org> |
Update the statfs structure with 64-bit fields to allow accurate reporting of multi-terabyte filesystem sizes. You should build and boot a new kernel BEFORE doing a `make world' as the new kernel will know about binaries using the old statfs structure, but an old kernel will not know about the new system calls that support the new statfs structure. Running an old kernel after a `make world' will cause programs such as `df' that do a statfs system call to fail with a bad system call. Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Tim Robbins <tjr@freebsd.org> Reviewed by: Julian Elischer <julian@elischer.org> Reviewed by: the hoards of <arch@freebsd.org> Sponsored by: DARPA & NAI Labs.
|
#
c055e5d4 |
|
07-Nov-2003 |
John Baldwin <jhb@FreeBSD.org> |
Mark ptrace(), ktrace(), utrace(), sysarch(), and issetugid() as MP safe. The parts of these calls that are not yet MP safe acquire Giant explicitly.
|
#
bd781a1e |
|
21-Oct-2003 |
Scott Long <scottl@FreeBSD.org> |
Don peril-sensitive sunglasses and mark pipe(2) as MPSAFE. I've beaten up on it for the last 15 hours with no signs of problems. It gives a small (1%) gain on buildworld since pipe_read/pipe_write are already free of Giant.
|
#
111b0d0d |
|
20-Oct-2003 |
David Malone <dwmalone@FreeBSD.org> |
Mark dup as MPSAFE. Giant was pushed into dup ages ago, but it looks like it was missed in syscalls.master. Spotted by: alc
|
#
ffe5125e |
|
06-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
msync(2) should be declared MP-safe.
|
#
dd7da9aa |
|
17-Jul-2003 |
David Xu <davidxu@FreeBSD.org> |
o Refine kse_thr_interrupt to allow it to handle different commands. o Remove TDF_NOSIGPOST. o Add a member td_waitset to proc structure, it will be used for sigwait. Tested by: deischen
|
#
9dde3bc9 |
|
28-Jun-2003 |
David Xu <davidxu@FreeBSD.org> |
o Change kse_thr_interrupt to allow sending a signal to a specified thread, or unblocking a thread in the kernel, and allow the UTS to specify whether the syscall should be restarted. o Add the ability for the UTS to monitor signals coming in and being removed from the process; the flag PS_SIGEVENT is used to indicate the events. o Add a KMF_WAITSIGEVENT KSE mailbox flag; the UTS calls kse_release with this flag set to wait for the above signal event. o For SA based threads, the kernel masks all signals in its signal mask, letting the UTS use kse_thr_interrupt to interrupt a thread and install a signal frame in userland for the thread. o Add a tm_syncsig in the thread mailbox; when a hardware trap occurs, it is used to deliver a synchronous signal to userland, and an upcall is scheduled, so the UTS can process the synchronous signal for the thread. Reviewed by: julian (mentor)
|
#
9e18f277 |
|
03-Jun-2003 |
Robert Watson <rwatson@FreeBSD.org> |
Add system calls to explicitly list extended attributes on a file/directory/link, rather than using a less explicit hack on the extattr retrieval API: extattr_list_fd() extattr_list_file() extattr_list_link() The existing API was counter-intuitive, and poorly documented. The prototypes for these system calls are identical to extattr_get_*(), but without a specific attribute name to leave NULL. Pointed out by: Dominic Giampaolo <dbg@apple.com> Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
fd7a8150 |
|
08-Apr-2003 |
Mike Barcroft <mike@FreeBSD.org> |
o In struct prison, add an allprison linked list of prisons (protected by allprison_mtx), a unique prison/jail identifier field, two path fields (pr_path for reporting and pr_root vnode instance) to store the chroot() point of each jail. o Add jail_attach(2) to allow a process to bind to an existing jail. o Add change_root() to perform the chroot operation on a specified vnode. o Generalize change_dir() to accept a vnode, and move namei() calls to callers of change_dir(). o Add a new sysctl (security.jail.list) which is a group of struct xprison instances that represent a snapshot of active jails. Reviewed by: rwatson, tjr
|
#
f27bf63b |
|
31-Mar-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Mark the various thr syscalls as MP safe. Previously there was a bug if this was not done since thr_exit() unwinds giant.
|
#
6eeb9653 |
|
31-Mar-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Include umtx.h in files generated by makesyscalls.sh - Add system calls for umtx.
|
#
8d5377e5 |
|
31-Mar-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Add the four thr related system calls.
|
#
a447cd8b |
|
31-Mar-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Define sigwait, sigtimedwait, and sigwaitinfo in terms of kern_sigtimedwait() which is capable of supporting all of their semantics. - These should be POSIX compliant but more careful review is needed before we announce this.
|
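Seen from userland, the shared semantics mentioned in the entry above mean that one call, sigtimedwait(), can stand in for the others depending on its timeout. The sketch below shows only the zero-timeout case, which turns the call into a non-blocking poll. The helper names block_signal and poll_pending are hypothetical, and the code says nothing about how kern_sigtimedwait() itself is written.

```c
#define _POSIX_C_SOURCE 200809L /* expose sigtimedwait() on strict libcs */
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Block `signo` in the caller so it can be collected synchronously. */
int
block_signal(int signo)
{
	sigset_t set;

	sigemptyset(&set);
	sigaddset(&set, signo);
	return (sigprocmask(SIG_BLOCK, &set, NULL));
}

/*
 * Poll for `signo` without sleeping: a zero timeout makes sigtimedwait()
 * return immediately. Returns signo if it was pending, 0 if nothing was
 * pending (EAGAIN), and -1 on any other error.
 */
int
poll_pending(int signo)
{
	sigset_t set;
	struct timespec zero = { 0, 0 };

	sigemptyset(&set);
	sigaddset(&set, signo);
	if (sigtimedwait(&set, NULL, &zero) == signo)
		return (signo);
	return (errno == EAGAIN ? 0 : -1);
}
```

The signal has to be blocked before it is raised; otherwise its default action would run instead of leaving it pending for sigtimedwait() to collect.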
#
eb117d5c |
|
20-Feb-2003 |
David Xu <davidxu@FreeBSD.org> |
Add a timeout parameter to kse_release.
|
#
b17c9cfa |
|
26-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Add const qualifier to data argument for msgsnd. PR: standards/45274 Submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
#
e1d7d0bb |
|
25-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Bring shm functions closer to the opengroup standards. PR: 47469 Submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
#
3beb3270 |
|
25-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Bring semop() closer to the opengroup standards. PR: 47471 Submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
#
cac3fba0 |
|
04-Jan-2003 |
David Xu <davidxu@FreeBSD.org> |
Some KSE syscalls are MPSAFE.
|
#
b1f4acd8 |
|
29-Dec-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Add definitions for four new system calls: __acl_get_link() Retrieve an ACL by name without following symbolic links. __acl_set_link() Set an ACL by name without following symbolic links. __acl_delete_link() Delete an ACL by name without following symbolic links. __acl_aclcheck_link() Check an ACL against a file by name without following symbolic links. These calls are similar in spirit to lstat(), lchown(), lchmod(), etc, and will be used under similar circumstances. Obtained from: TrustedBSD Project
|
#
92da00bb |
|
15-Dec-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
This is David Schultz's swapoff code which I am finally able to commit. This should be considered highly experimental for the moment. Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> MFC after: 3 weeks
|
#
2be05b70 |
|
15-Nov-2002 |
Daniel Eischen <deischen@FreeBSD.org> |
Add getcontext, setcontext, and swapcontext as system calls. Previously these were libc functions but were requested to be made into system calls for atomicity and to coalesce what might be two entrances into the kernel (signal mask setting and floating point trap) into one. A few style nits and comments from bde are also included. Tested on alpha by: gallatin
|
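For readers unfamiliar with the context-switching interface named in the entry above, here is a minimal sketch of a coroutine built on getcontext(), makecontext(), and swapcontext(). All names (run_coroutine, coroutine, steps, the 64 KiB stack size) are invented for the illustration, and the example assumes a libc that still ships the ucontext API.

```c
#define _XOPEN_SOURCE 600 /* request XSI declarations; some platforms need this for <ucontext.h> */
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];	/* stack for the coroutine (size is arbitrary) */
static int steps;

/* Runs on co_stack: do one step, yield to the caller, then do another. */
static void
coroutine(void)
{
	steps++;
	swapcontext(&co_ctx, &main_ctx);	/* yield back to run_coroutine() */
	steps++;
	/* Returning here resumes uc_link, i.e. main_ctx. */
}

/*
 * Hypothetical driver: start the coroutine, let it yield once, resume it,
 * and report progress as (steps after first yield) * 10 + (final steps).
 * Returns -1 if a context call fails.
 */
int
run_coroutine(void)
{
	int after_first;

	steps = 0;
	if (getcontext(&co_ctx) == -1)
		return (-1);
	co_ctx.uc_stack.ss_sp = co_stack;
	co_ctx.uc_stack.ss_size = sizeof(co_stack);
	co_ctx.uc_link = &main_ctx;	/* where to go when coroutine() returns */
	makecontext(&co_ctx, coroutine, 0);

	if (swapcontext(&main_ctx, &co_ctx) == -1)	/* run until the yield */
		return (-1);
	after_first = steps;
	if (swapcontext(&main_ctx, &co_ctx) == -1)	/* resume until it returns */
		return (-1);
	return (after_first * 10 + steps);
}
```

Each swapcontext() saves the current register state and signal mask and installs the target's in one call, which is the kind of combined operation the entry says motivated moving these functions into the kernel.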
#
21bb9ea2 |
|
05-Nov-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Flesh out the definition of __mac_execve(): per earlier discussion, it's essentially execve() with an optional MAC label argument. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
6cedb451 |
|
01-Nov-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Rename __execve_mac() to __mac_execve() for increased consistency with other MAC system calls. Requested by: various (phk, gordont, jake, ...)
|
#
23eeeff7 |
|
25-Oct-2002 |
Peter Wemm <peter@FreeBSD.org> |
Split 4.x and 5.x signal handling so that we can keep 4.x signal handling clean and functional as 5.x evolves. This allows some of the nasty bandaids in the 5.x codepaths to be unwound. Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an anti-foot-shooting measure in place, 5.x folks need this for a while) and finish encapsulating the older stuff under COMPAT_43. Since the ancient stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *' to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn is supposed to take), add a compile time check to prevent foot shooting there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc. Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago). Approved by: re
|
#
aad1cdc8 |
|
22-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Flesh out prototypes for __mac_get_pid, __mac_get_link, and __mac_set_link, based on __mac_get_proc() except with a pid, and __mac_get_file(), __mac_set_file() except that they do not follow symlinks. First in a series of commits to flesh out the user API. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
8556393b |
|
19-Oct-2002 |
Peter Wemm <peter@FreeBSD.org> |
Stake a claim on 418 (__xstat), 419 (__xfstat), 420 (__xlstat)
|
#
c8447553 |
|
19-Oct-2002 |
Peter Wemm <peter@FreeBSD.org> |
Grab 416/417 real estate before I get burned while testing again. This is for the not-quite-ready signal/fpu abi stuff. It may not see the light of day, but I'm certainly not going to be able to validate it when getting shot in the foot due to syscall number conflicts.
|
#
bc5245d9 |
|
19-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Add a placeholder for the execve_mac() system call, similar to SELinux's execve_secure() system call, which permits a process to pass in a label for a label change during exec. This permits SELinux to change the label for the resulting exec without a race following a manual label change on the process. Because this interface uses our general purpose MAC label abstraction, we call it execve_mac(), and wrap our port of SELinux's execve_secure() around it with appropriate sid mappings. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
803cc8aa |
|
14-Oct-2002 |
Peter Wemm <peter@FreeBSD.org> |
Restore pointer that was removed in 1.128. This wasn't a merge-o.
|
#
3c4aba09 |
|
09-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Fix what looks like a merge-o from a conflict in the last commit to syscalls.master.
|
#
0d66d36f |
|
09-Oct-2002 |
Peter Wemm <peter@FreeBSD.org> |
Add a pointer to the alternate syscall tables on 64 bit platforms.
|
#
8b10835c |
|
09-Oct-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Flesh out the extattr_{delete,get,set}_link() system calls: variations on the _file() theme that do not follow symlinks. Sync to MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
36a8dac1 |
|
02-Oct-2002 |
Archie Cobbs <archie@FreeBSD.org> |
Let kse_wakeup() take a KSE mailbox pointer argument. Reviewed by: julian
|
#
4499985e |
|
30-Sep-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Reserve system call numbers for the following system calls: __mac_get_pid Retrieve MAC label of a process by pid Similar to __mac_get_proc() except that the target process of the operation is explicitly specified rather than assuming curthread. __mac_get_link Retrieve MAC label of a path with NOFOLLOW __mac_set_link Set MAC label of a path with NOFOLLOW extattr_set_link Set EAs on a path with NOFOLLOW extattr_get_link Retrieve EAs on a path with NOFOLLOW extattr_delete_link Delete EAs on a path with NOFOLLOW These calls are similar to __mac_get_file(), __mac_set_file(), extattr_set_file(), extattr_get_file(), and extattr_delete_file(), except that they do not follow symlinks. The distinction between these calls is similar to lchown() vs chown(). Implementations to follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
89def71c |
|
25-Sep-2002 |
Archie Cobbs <archie@FreeBSD.org> |
Make the following name changes to KSE related functions, etc., to better represent their purpose and minimize namespace conflicts: kse_fn_t -> kse_func_t struct thread_mailbox -> struct kse_thr_mailbox thread_interrupt() -> kse_thr_interrupt() kse_yield() -> kse_release() kse_new() -> kse_create() Add missing declaration of kse_thr_interrupt() to <sys/kse.h>. Regenerate the various generated syscall files. Minor style fixes. Reviewed by: julian
|
#
6d5dec35 |
|
18-Sep-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Add the rest of the kernel support for the sem_ API in kern/uipc_sem.c. Option 'P1003_1B_SEMAPHORES' to compile them in, or load the "sem" module to activate them. Have kern/makesyscalls.sh emit an include for sys/_semaphore.h into sysproto.h to pull in the typedef for semid_t. Add the syscalls to the syscall table as module stubs.
|
#
f61b8549 |
|
19-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
mac_syscall is now implemented, switch to MSTD. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
280f0785 |
|
06-Aug-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Rename mac_policy() to mac_syscall() to be more reflective of its purpose. Submitted by: cvance@tislabs.com Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
55fb7830 |
|
30-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce support for Mandatory Access Control and extensible kernel access control. Replace 'void *' with 'struct mac *' now that mac.h is in the base tree. The current POSIX.1e-derived userland MAC interface is scheduled for replacement, but will act as a functional placeholder until the replacement is done. These system calls allow userland processes to get and set labels on both the current process, as well as file system objects and file descriptor backed objects.
|
#
aedbd622 |
|
30-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce a mac_policy() system call that will provide MAC policies with a general purpose front end entry point for user applications to invoke. The MAC framework will route the system call to the appropriate policy by name. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
5d37d00a |
|
29-Jul-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Prototype function arguments, only with MAC-specific structures replaced with void until we bring in the actual structure definitions. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
8a32e0c9 |
|
13-Jul-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove incorrect comment about now corrected manpage.
|
#
9c341296 |
|
12-Jul-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Create a bug-for-bug FreeBSD4 compatible version of sendfile and move the fixed sendfile over. This is needed to preserve binary compatibility from 4.x to 5.x.
|
#
e602ba25 |
|
29-Jun-2002 |
Julian Elischer <julian@FreeBSD.org> |
Part 1 of KSE-III The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
#
65772a1a |
|
13-Jun-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Keep POSIX.1e capabilities system call placeholders, but remove definitions.
|
#
494eefd8 |
|
27-May-2002 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Add syscall uuidgen() for generating Universally Unique Identifiers (UUIDs). On ia64 UUIDs, aka GUIDs, are used by EFI and the firmware among others. To create GUID Partition Tables (GPTs), we need to be able to generate UUIDs.
|
#
8d9b781f |
|
05-May-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Add an entry for the lchflags(2) syscall. It's useful to prevent a symlink deletion. Reviewed by: rwatson
|
#
fd448168 |
|
17-Apr-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Add an entry for the kenv(2) syscall (code to follow). Reviewed by: peter
|
#
b0d97980 |
|
13-Apr-2002 |
Alan Cox <alc@FreeBSD.org> |
Remove the requirement that Giant be held around sigreturn().
|
#
a0805f6f |
|
11-Apr-2002 |
Alan Cox <alc@FreeBSD.org> |
Remove the requirement that Giant be held around osigreturn(). All platform- specific implementations are MPSAFE.
|
#
11ffd032 |
|
05-Mar-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Reserve system call numbers for the MAC framework. This will prevent people working on the MAC tree from getting toasted whenever system call numbers are allocated in the main tree (for example, for KSE :-). Calls allocated: __mac_{get,set}_proc, __mac_{get,set}_{fd,file}(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
c28841c1 |
|
18-Feb-2002 |
Julian Elischer <julian@FreeBSD.org> |
Add stub syscalls and definitions for KSE calls. "Book'em Danno"
|
#
8a2c87e7 |
|
18-Feb-2002 |
Julian Elischer <julian@FreeBSD.org> |
Add 5 KSE syscalls. Two will be implemented with the next KSE step and the others are reservations for coming code. All will be stubbed in this kernel in the next commit. This will allow people to easily make KSE binaries for userland testing (the syscalls will be in libc) but they will still need a real KSE kernel to test it. (libc looks in /sys to decide what it should add stubs for).
|
#
bc874287 |
|
17-Feb-2002 |
Daniel Eischen <deischen@FreeBSD.org> |
Fix prototype to sigreturn to use struct __ucontext instead of ucontext_t.
|
#
74237f55 |
|
09-Feb-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Part I: Update extended attribute API and ABI: o Modify the system call syntax for extattr_{get,set}_{fd,file}() so as not to use the scatter gather API (which appeared not to be used by any consumers, and be less portable), rather, accepts 'data' and 'nbytes' in the style of other simple read/write interfaces. This changes the API and ABI. o Modify system call semantics so that extattr_get_{fd,file}() return a size_t. When performing a read, the number of bytes read will be returned, unless the data pointer is NULL, in which case the number of bytes of data are returned. This changes the API only. o Modify the VOP_GETEXTATTR() vnode operation to accept a *size_t argument so as to return the size, if desirable. If set to NULL, the size will not be returned. o Update various filesystems (pseudofs, ufs) to DTRT. These changes should make extended attributes more useful and more portable. More commits to rebuild the system call files, as well as update userland utilities to follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
#
860965f1 |
|
01-Feb-2002 |
Bruce Evans <bde@FreeBSD.org> |
Made osigreturn(2) standard so that SYS_osigreturn can be used in the signal trampoline for old signals. The arches that support old signals currently abuse sigreturn(2) instead. This mainly complicates things and slightly breaks the new sigreturn(2). COMPAT is too limited to support the correct configuration of osigreturn, and this commit doesn't attempt to fix it; it just moves the bogusness: osigreturn() must now be provided unconditionally even on arches that don't really need it; previously it had to be provided under the bogus condition defined(COMPAT_43).
|
#
21d56e9c |
|
29-Dec-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Make AIO a loadable module. Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO will use at_exit(9). Add functions at_exec(9), rm_at_exec(9) which function nearly the same as at_exit(9) and rm_at_exit(9), these functions are called on behalf of modules at the time of execve(2) after the image activator has run. Use a modified version of tegge's suggestion via at_exec(9) to close an exploitable race in AIO. Fix SYSCALL_MODULE_HELPER such that it's architecturally neutral, the problem was that one had to pass it a parameter indicating the number of arguments which were actually the number of "int". Fix it by using an inline version of the AS macro against the syscall arguments. (AS should be available globally but we'll get to that later.) Add a primitive system for dynamically adding kqueue ops, it's really not as sophisticated as it should be, but I'll discuss with jlemon when he's around.
|
#
c60693db |
|
02-Nov-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Reserve 378 for the new mount syscall Maxime Henrion <mux@qualys.com> is working on. (This is to get us more than 32 mountoptions).
|
#
b55abfd9 |
|
13-Oct-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Reserve system call 377 for afs_syscall; by reserving a system call number, portable OpenAFS applications don't have to attempt to determine what system call number was dynamically allocated. No system call prototype or implementation is defined. Requested by: Tom Maher <tardis@watson.org>
|
#
9c94f773 |
|
21-Sep-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Introduce eaccess(2), a version of access(2) that uses the effective credentials rather than the real credentials. This is useful for implementing GUI's which need to modify icons based on access rights, but where use of open(2) is too expensive, use of stat(2) doesn't reflect the file system's real protection model, and use of access() suffers from real/effective credential confusion. This implementation provides the same semantics as the call of the same name on SCO OpenServer. Note: using this call improperly can leave you subject to some of the same races present in the access(2) call. o To implement this, break out the basic logic of access(2) into vpaccess(), which accepts a passed credential to perform the invocation of VOP_ACCESS(). Add eaccess(2) to invoke vpaccess(), and modify access(2) to use vpaccess(). Obtained from: TrustedBSD Project
|
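The difference between access(2) and eaccess(2) described in the entry above only shows up when a process's real and effective credentials differ, as in a set-user-ID program. The sketch below, with the hypothetical helper creds_agree, compares the two calls for one path; in an ordinary process where real and effective IDs match, they are expected to agree. The _GNU_SOURCE define is an assumption about what glibc needs to declare eaccess().

```c
#define _GNU_SOURCE /* glibc declares eaccess() only with this; assumed harmless on FreeBSD */
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 when the check with real credentials
 * (access) and the check with effective credentials (eaccess) give the
 * same answer for `path` and `mode`, 0 when they disagree.
 */
int
creds_agree(const char *path, int mode)
{
	int real_ok = (access(path, mode) == 0);
	int eff_ok = (eaccess(path, mode) == 0);

	return (real_ok == eff_ok);
}
```

As the entry itself warns, either call is only a snapshot: the file's permissions can change between the check and a later open(2), so the result should guide presentation (such as which icon to show) rather than serve as a security decision.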
#
eb25edbd |
|
18-Sep-2001 |
Peter Wemm <peter@FreeBSD.org> |
Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.
|
#
257d1988 |
|
01-Sep-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Synchronize syscalls.master(s) with recent Giant pushdown work
|
#
918c3b13 |
|
31-Aug-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Make yield() MPSAFE. Synchronize syscalls.master with all MPSAFE changes to date. Synchronize new syscall generation follows because yield() will panic if it is out of sync with syscalls.master.
|
#
df998760 |
|
30-Aug-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Giant pushdown syscalls in kern/uipc_syscalls.c. Affected calls: recvmsg(), sendmsg(), recvfrom(), accept(), getpeername(), getsockname(), socket(), connect(), accept(), send(), recv(), bind(), setsockopt(), listen(), sendto(), shutdown(), socketpair(), sendfile()
|
#
b6a4b4f9 |
|
30-Aug-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Giant Pushdown: sysv shm, sem, and msg calls.
|
#
356861db |
|
30-Aug-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Remove the MPSAFE keyword from the parser for syscalls.master. Instead introduce the [M] prefix to existing keywords. e.g. MSTD is the MP SAFE version of STD. This is preparatory for a massive Giant lock pushdown. The old MPSAFE keyword made syscalls.master too messy. Begin comments MP-Safe procedures with the comment: /* * MPSAFE */ This comment means that the procedure may be called without Giant held (The procedure itself may still need to obtain Giant temporarily to do its thing). sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE sv_transtrap() is now MP SAFE and assumed to be MP SAFE ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown) trapsignal() is now MP SAFE (Giant Pushdown) Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant) test in syscall[2]() in */*/trap.c now do not. Instead they explicitly unlock Giant if they previously obtained it, and then assert that it is no longer held to catch broken system calls. Rebuild syscall tables.
|
#
b6343691 |
|
29-May-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove a comment which was past its shelf life. PR: 18750 Submitted by: Tony Finch <dot@dotat.at>
|
#
23955314 |
|
18-May-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
|
#
b4b469e6 |
|
11-May-2001 |
Tor Egge <tegge@FreeBSD.org> |
gettimeofday() is MP safe on both -current and -stable.
|
#
130d0157 |
|
11-Apr-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Introduce a new system call, __setsugid(), which allows a process to toggle the P_SUGID bit explicitly, rather than relying on it being set implicitly by other protection and credential logic. This feature is introduced to support inter-process authorization regression testing by simplifying userland credential management allowing the easy isolation and reproduction of authorization events with specific security contexts. This feature is enabled only by "options REGRESSION" and is not intended to be used by applications. While the feature is not known to introduce security vulnerabilities, it does allow processes to enter previously inaccessible parts of the credential state machine, and is therefore disabled by default. It may not constitute a risk, and therefore in the future pending further analysis (and appropriate need) may become a published interface. Obtained from: TrustedBSD Project
|
#
fec605c8 |
|
31-Mar-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Introduce extattr_{delete,get,set}_fd() to allow extended attribute operations on file descriptors, which complement the existing set of calls, extattr_{delete,get,set}_file() which act on paths. In doing so, restructure the system call implementation such that the two sets of functions share most of the relevant code, rather than duplicating it. This pushes the vnode locking into the shared code, but keeps the copying in of some arguments in the system call code. Allowing access via file descriptors reduces the opportunity for race conditions when managing extended attributes. Obtained from: TrustedBSD Project
|
#
30632071 |
|
18-Mar-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Rename "namespace" argument to "attrnamespace" as namespace is a C++ reserved word. Submitted by: jkh Obtained from: TrustedBSD Project
|
#
70f36851 |
|
14-Mar-2001 |
Robert Watson <rwatson@FreeBSD.org> |
o Change the API and ABI of the Extended Attribute kernel interfaces to introduce a new argument, "namespace", rather than relying on a first- character namespace indicator. This is in line with more recent thinking on EA interfaces on various mailing lists, including the posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces are defined by default, EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER, where the primary distinction lies in the access control model: user EAs are accessible based on the normal MAC and DAC file/directory protections, and system attributes are limited to kernel-originated or appropriately privileged userland requests. o These API changes occur at several levels: the namespace argument is introduced in the extattr_{get,set}_file() system call interfaces, at the vnode operation level in the vop_{get,set}extattr() interfaces, and in the UFS extended attribute implementation. Changes are also introduced in the VFS extattrctl() interface (system call, VFS, and UFS implementation), where the arguments are modified to include a namespace field, as well as modified to avoid direct access to userspace variables from below the VFS layer (in the style of recent changes to mount by adrian@FreeBSD.org). This required some cleanup and bug fixing regarding VFS locks and the VFS interface, as a vnode pointer may now be optionally submitted to the VFS_EXTATTRCTL() call. Updated documentation for the VFS interface will be committed shortly. o In the near future, the auto-starting feature will be updated to search two sub-directories to the ".attribute" directory in appropriate file systems: "user" and "system" to locate attributes intended for those namespaces, as the single filename is no longer sufficient to indicate what namespace the attribute is intended for. Until this is committed, all attributes auto-started by UFS will be placed in the EXTATTR_NAMESPACE_SYSTEM namespace.
o The default POSIX.1e attribute names for ACLs and Capabilities have been updated to no longer include the '$' in their filename. As such, if you're using these features, you'll need to rename the attribute backing files to the same names without '$' symbols in front. o Note that these changes will require changes in userland, which will be committed shortly. These include modifications to the extended attribute utilities, as well as to libutil for new namespace string conversion routines. Once the matching userland changes are committed, a buildworld is recommended to update all the necessary include files and verify that the kernel and userland environments are in sync. Note: If you do not use extended attributes (most people won't), upgrading is not imperative although since the system call API has changed, the new userland extended attribute code will no longer compile with old include files. o Couple of minor cleanups while I'm there: make more code compilation conditional on FFS_EXTATTR, which should recover a bit of space on kernels running without EA's, as well as update copyright dates. Obtained from: TrustedBSD Project
|
#
86360fee |
|
01-Dec-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup from struct proc, which are now unused (p_nthread already was). Remove process flag P_KTHREADP which was untested and only set in vfs_aio.c (it should use kthread_create). Move the yield system call to kern_synch.c as kern_threads.c has been removed completely. moral support from: alfred, jhb
|
#
78525ce3 |
|
01-Dec-2000 |
Alfred Perlstein <alfred@FreeBSD.org> |
sysvipc loadable. new syscall entry lkmressys - "reserved loadable syscall" Make syscall_register allow overwriting of such entries (lkmressys).
|
#
ae51d56c |
|
28-Aug-2000 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Fix prototypes for {o|}{g|s}etrlimit. A recent change in the Linuxulator caused this bug to trigger.
|
#
4e0f152b |
|
29-Jul-2000 |
Peter Wemm <peter@FreeBSD.org> |
Sigh. Fix SYS_exit problems. I misunderstood the significance of these trailing options.
|
#
ac2b067b |
|
28-Jul-2000 |
Peter Wemm <peter@FreeBSD.org> |
Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping gcc's internal exit() prototypes and the (futile) hackery that we did to try and avoid warnings. main() was renamed for similar reasons. Remove an exit related hack from makesyscalls.sh.
|
#
a8e65b91 |
|
18-Jul-2000 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Simplify kqueue API slightly. Discussed on: -arch
|
#
92eebb8a |
|
13-Jul-2000 |
Robert Watson <rwatson@FreeBSD.org> |
o Introduce syscall prototypes, stubs for __cap_{get,set}_{fd,file}, syscalls to manage capability sets on files. First of two commits. Obtained from: TrustedBSD Project
|
#
b09b66ab |
|
15-Jun-2000 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce syscalls for process capability manipulation. Currently backs onto already committed stubs. Commit one of two. Reviewed by: Damned if I can remember. Many people. Obtained from: TrustedBSD Project
|
#
aa4b7eae |
|
09-May-2000 |
Bruce Evans <bde@FreeBSD.org> |
Fixed the declaration of mmap(). The crufty padding arg had the wrong type. This gave an inconsistent amount of crufty padding on i386's with 64-bit longs (8 bytes instead of 4). On alphas it gives a consistent amount of crufty padding (8 bytes) in addition to the 4 bytes of normal padding caused by passing int args as register_t's. Fixed the args struct tag for the NOPROTO syscalls (netbsd_lchown() and netbsd_msync()). The tag is currently unused for NOPROTO syscalls, so the bug has no effect, but it will be used even in the NOPROTO case to calculate sy_nargs correctly.
|
#
39e4c0c8 |
|
01-May-2000 |
Peter Wemm <peter@FreeBSD.org> |
Remove undocumented broken-as-designed semconfig() syscall.
|
#
cb679c38 |
|
16-Apr-2000 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Introduce kqueue() and kevent(), a kernel event notification facility.
|
#
c01df631 |
|
03-Apr-2000 |
Alfred Perlstein <alfred@FreeBSD.org> |
Make makesyscalls.sh parse an optional field 'MPSAFE' that specifies that a syscall does not want the BGL to be grabbed automatically. Add the new MPSAFE flag to the syscalls that dillon has determined to be MPSAFE.
|
#
5134b3e9 |
|
18-Jan-2000 |
Robert Watson <rwatson@FreeBSD.org> |
Fix bde'isms in acl/extattr syscall interface, renaming syscalls to prettier (?) names, adding some const's around here, et al. Commit 1 out of 3. Reviewed by: bde
|
#
8ccd6334 |
|
16-Jan-2000 |
Peter Wemm <peter@FreeBSD.org> |
Implement setres[ug]id() and getres[ug]id(). This has been sitting in my tree for ages (~2 years) waiting for an excuse to commit it. Now Linux has implemented it and it seems that Staroffice (when using the linux_base6.1 port's libc) calls this in the linux emulator and dies in setup. The Linux emulator can call these now.
|
#
bfbbc4aa |
|
13-Jan-2000 |
Jason Evans <jasone@FreeBSD.org> |
Add aio_waitcomplete(). Make aio work correctly for socket descriptors. Make gratuitous style(9) fixes (me, not the submitter) to make the aio code more readable. PR: kern/12053 Submitted by: Chris Sedore <cmsedore@maxwell.syr.edu>
|
#
20883b0f |
|
21-Dec-1999 |
Alfred Perlstein <alfred@FreeBSD.org> |
make getfh a standard syscall instead of dependent on having NFSSERVER defined, useful for userland fileservers that want to use a filehandle type interface to the filesystem. Submitted by: Assar Westerlund assar@stacken.kth.se PR: kern/15452
|
#
ef351daa |
|
18-Dec-1999 |
Robert Watson <rwatson@FreeBSD.org> |
First pass commit to introduce new ACL and Extended Attribute system calls. The second pass commit with all the supporting code will happen shortly afterwards. Reviewed by: eivind
|
#
b08210f5 |
|
17-Nov-1999 |
Brian Somers <brian@FreeBSD.org> |
modfind(char *) -> modfind(const char *) Reminded by: dfr
|
#
b7d85123 |
|
12-Oct-1999 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Now that userland including modules don't use the osig* syscalls, make them of type COMPAT.
|
#
da3605db |
|
29-Sep-1999 |
Marcel Moolenaar <marcel@FreeBSD.org> |
sigset_t change (part 1 of 5): Rename sigaction, sigprocmask, sigpending and sigsuspend to osigaction, osigprocmask, osigpending and osigsuspend (resp) and add new syscalls for them to support the new sigset_t without breaking existing binaries. Change the prototype of sigaltstack to use the typedef stack_t instead of struct sigaltstack to reflect that it is SUSv2 compliant. Also, rename sigreturn to osigreturn and add a new syscall to support the modified stackframe. The change is caused by sigreturn operating on ucontext_t now and the fact that siginfo_t has been updated to conform to SUSv2.
|
#
c24fda81 |
|
10-Sep-1999 |
Alfred Perlstein <alfred@FreeBSD.org> |
Separate the export check in VFS_FHTOVP; exports are now checked via VFS_CHECKEXP. Add fh(open|stat|statfs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD
|
#
c3aac50f |
|
27-Aug-1999 |
Peter Wemm <peter@FreeBSD.org> |
$Id$ -> $FreeBSD$
|
#
23955079 |
|
11-Aug-1999 |
Nik Clayton <nik@FreeBSD.org> |
Add CPT_NOA, LIBCOMPAT, NODEF, NOARGS, NOPROTO, and NOIMPL to the commented list of available types. PR: docs/13007 Submitted by: Assar Westerlund <assar@sics.se>
|
#
45f26d41 |
|
05-Aug-1999 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Move syscall 180 back to where it was before and fix the incorrect comment which led me to move it in the first place.
|
#
b24eb279 |
|
04-Aug-1999 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Reserve a syscall for the arla folks. I'm assuming that since syscalls.c and init_sysent.c are checked into CVS, I should also commit the regenerated copies even though they're built by syscalls.master. Correct? Bruce? :)
|
#
f664346f |
|
13-May-1999 |
Bruce Evans <bde@FreeBSD.org> |
Fixed nonsense arg type `const caddr_t' in the prototype() for utrace(). Changed to `const void *'. utrace() is undocumented, so nothing should notice. Fixed missing consts for utrace() and ktrace() in syscalls.master. sys/ktrace.h is missing some Lite2 changes of shorts to ints.
|
#
02daf150 |
|
28-Apr-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add the jail system call.
|
#
8fe387ab |
|
04-Apr-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Add standard padding argument to pread and pwrite syscall. That should make them NetBSD compatible. Add parameter to fo_read and fo_write. (The only flag, FOF_OFFSET, means that the offset is set in the struct uio). Factor out some common code from read/pread/write/pwrite syscalls.
|
#
4160ccd9 |
|
27-Mar-1999 |
Alan Cox <alc@FreeBSD.org> |
Added pread and pwrite. These functions are defined by the X/Open Threads Extension. (Note: We use the same syscall numbers as NetBSD.) Submitted by: John Plevyak <jplevyak@inktomi.com>
|
#
325e13dd |
|
10-Nov-1998 |
Peter Wemm <peter@FreeBSD.org> |
A kldsym(2) syscall prototype for extracting information from the in-kernel linker. This is intended to replace kvm_mkdb etc. The first version only does name->value lookups, but it's open ended. value->name lookups would probably be a good thing to do too. It's been suggested to try and connect the symbol tables to sysctl (which is probably a more flexible way of doing it if it's done right), but that is far more complex and difficult than I was ready to have a shot at.
|
#
dd0b2081 |
|
05-Nov-1998 |
David Greenman <dg@FreeBSD.org> |
Implemented zero-copy TCP/IP extensions via sendfile(2) - send a file to a stream socket. sendfile(2) is similar to implementations in HP-UX, Linux, and other systems, but the API is more extensive and addresses many of the complaints that the Apache Group and others have had with those other implementations. Thanks to Marc Slemko of the Apache Group for helping me work out the best API for this. Anyway, this has the "net" result of speeding up sends of files over TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU cycles) when compared to a traditional read/write loop.
|
#
2e83b281 |
|
24-Aug-1998 |
Doug Rabson <dfr@FreeBSD.org> |
Fix a few syscall arguments to use size_t instead of u_int.
|
#
ecbb00a2 |
|
07-Jun-1998 |
Doug Rabson <dfr@FreeBSD.org> |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us in line with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
#
786cf38a |
|
14-May-1998 |
Peter Wemm <peter@FreeBSD.org> |
deep-six signanosleep(). It sounded like a good idea at the time.
|
#
1f562172 |
|
10-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Fix the futimes/undelete/utrace conflict with other BSD's. Note that the only common usage of utrace (the possible problem with this commit) is with malloc, so this should not be a real problem. Add the various NetBSD syscalls that allow full emulation of their development environment.
|
#
8a6472b7 |
|
28-Mar-1998 |
Peter Dufault <dufault@FreeBSD.org> |
Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and _KPOSIX_PRIORITY_SCHEDULING options to work. Changes: Change all "posix4" to "p1003_1b". Misnamed files are left as "posix4" until I'm told if I can simply delete them and add new ones; Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux; Add man pages for _POSIX_PRIORITY_SCHEDULING system calls; Add options to LINT; Minor fixes to P1003_1B code during testing.
|
#
14f1d426 |
|
03-Feb-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed type of mincore().
|
#
c5b193bf |
|
30-Jan-1998 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Retire LFS. If you want to play with it, you can find the final version of the code in the repository under the tag LFS_RETIREMENT. If somebody makes LFS work again, adding it back is certainly desirable, but as it is now nobody seems to care much about it, and it has suffered considerable bitrot since its somewhat haphazard integration. R.I.P
|
#
7b778b5e |
|
23-Jan-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related options new-style. This introduces an xxxFS_BOOT for each of the rootable filesystems. (Presently not required, but encouraged to allow a smooth move of option *FS to opt_dontuse.h later.) LFS is temporarily disabled, and will be re-enabled tomorrow.
|
#
de17eb59 |
|
01-Jan-1998 |
Alexander Langer <alex@FreeBSD.org> |
Added missing caddr_t --> void * conversions for sys/mman.h functions. Submitted by: bde
|
#
e6e21bc0 |
|
26-Oct-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add "NOIMPL" for syscalls we know what is, but don't implement as "STD". Use this for getfh & nfssvc.
|
#
7822f1c6 |
|
14-Sep-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a __getcwd() syscall. This is intentionally undocumented, but all it does is to try to figure the pwd out from the vfs namecache, and return a reversed string to it. libc:getcwd() is responsible for flipping it back.
|
#
8cb0553a |
|
13-Sep-1997 |
Peter Wemm <peter@FreeBSD.org> |
Activate poll(2) syscall
|
#
6871cc62 |
|
18-Aug-1997 |
Peter Wemm <peter@FreeBSD.org> |
SVR4/XPG-style getpgid()/getsid() syscalls.
|
#
2c1011f7 |
|
15-Jun-1997 |
John Dyson <dyson@FreeBSD.org> |
Modifications to existing files to support the initial AIO/LIO and kernel based threading support.
|
#
99f06d5c |
|
01-Jun-1997 |
Peter Wemm <peter@FreeBSD.org> |
New syscall, signanosleep(), which is a hybrid of sigsuspend(2) and nanosleep(2). It sleeps until either the time expires, or a signal permitted by the supplied mask arrives (eg: SIGALRM if appropriate)
|
#
851679e5 |
|
08-May-1997 |
Peter Wemm <peter@FreeBSD.org> |
oops. NODIDE -> NOHIDE
|
#
b6f031b7 |
|
08-May-1997 |
Peter Wemm <peter@FreeBSD.org> |
Define entries for the posix-style clock/timer syscalls including nanosleep(). Also, note some syscall conflicts with other systems and indicate slots tagged for use with other syscalls some day.
|
#
cea6c86c |
|
07-May-1997 |
Doug Rabson <dfr@FreeBSD.org> |
This is the kernel linker. To use it, you will first need to apply the patches in freefall:/home/dfr/ld.diffs to your ld sources and set BINFORMAT to aoutkld when linking the kernel. Library changes and userland utilities will appear in a later commit.
|
#
56f12a6c |
|
31-Mar-1997 |
Peter Wemm <peter@FreeBSD.org> |
issetugid is now implemented rather than reserved
|
#
4eb542c6 |
|
30-Mar-1997 |
Peter Wemm <peter@FreeBSD.org> |
Reserve 252 (poll, first in OpenBSD) Reserve 253 (issetugid, as in OpenBSD) Allocate 254 for lchown(2)
|
#
6875d254 |
|
22-Feb-1997 |
Peter Wemm <peter@FreeBSD.org> |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
996c772f |
|
09-Feb-1997 |
John Dyson <dyson@FreeBSD.org> |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
ac0ad63f |
|
16-Jan-1997 |
Bruce Evans <bde@FreeBSD.org> |
Reduced #include spam in <sys/sysproto.h> and fixed things that depended on it. makesyscalls.sh: This parsed $Id$. Fixed(?) to parse $FreeBSD$. The output is wrong when the id is not expanded in the source file. syscalls.master: Fixed declaration of sigsuspend(). There are still some bogons and spam involving sigset_t. Use `struct foo *' instead of the equivalent `foo_t *' for some nfs and lfs syscalls so that <sys/sysproto.h> doesn't depend on <sys/mount.h>.
|
#
1130b656 |
|
14-Jan-1997 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
e6c4b9ba |
|
19-Sep-1996 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add the utrace(caddr_t addr,size_t len) syscall, that will store the data pointed at in a ktrace file, if this process is being ktrace'ed. I'm using this to profile malloc usage. The advantage is that there is no context around this call, ie, no open file or socket, so it will work in any process, and you can decide if you want it to collect data or not.
|
#
b08f7993 |
|
20-Aug-1996 |
Sujal Patel <smpatel@FreeBSD.org> |
Remove the kernel FD_SETSIZE limit for select(). Make select()'s first argument 'int' not 'u_int'. Reviewed by: bde
|
#
edbfedac |
|
11-Mar-1996 |
Peter Wemm <peter@FreeBSD.org> |
Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all files are off the vendor branch, so this should not change anything. A "U" marker generally means that the file was not changed in between the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally means that there was a change. [note new unused (in this form) syscalls.conf, to be 'cvs rm'ed]
|
#
3f7efdf3 |
|
02-Mar-1996 |
Peter Wemm <peter@FreeBSD.org> |
Change the 'int len' args in the mmap/msync/mincore/etc class syscalls to 'size_t' as per bde's request.
|
#
96ac07ef |
|
23-Feb-1996 |
Peter Wemm <peter@FreeBSD.org> |
Add hooks for rfork/minherit pair, and reset args of vfork in preparation for adding the syscalls.
|
#
4f9a71f6 |
|
23-Feb-1996 |
Peter Wemm <peter@FreeBSD.org> |
Note the syscall numbers used in BSD/OS 2.x. We don't want to accidentally use one of these ourselves as it'd make it harder to run their binaries. Also, remove the now-defunct #include "opt_sysvipc.h".
|
#
99cb2993 |
|
13-Jan-1996 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add an option NFS_NOSERVER which saves 100K in the install kernel (or any other kernel that uses it). Use with option NFS.
|
#
e7ae3bf0 |
|
07-Jan-1996 |
Peter Wemm <peter@FreeBSD.org> |
Remove the #ifdef SYSVSHM etc. Always call the functions, some stubs are about to go in. This is to fix the problem with the ibcs2 and linux lkm's not being able to call the sysv ipc functions unless the build is modified.
|
#
50c73f36 |
|
04-Jan-1996 |
Garrett Wollman <wollman@FreeBSD.org> |
Convert SYSV IPC to new-style options. (I hope I got everything...) The LKMs will need an extra file, to come later.
|
#
db6a20e2 |
|
03-Jan-1996 |
Garrett Wollman <wollman@FreeBSD.org> |
Converted two options over to the new scheme: USER_LDT and KTRACE.
|
#
bf4f3984 |
|
14-Dec-1995 |
Peter Wemm <peter@FreeBSD.org> |
Add the direct sysv shm/sem/msg system calls, in the same way as NetBSD. This costs very little, we gain prototypes for the calls from the linux emulator, and this is one less thing in the way of NetBSD binary support.
|
#
93915a2a |
|
11-Nov-1995 |
Bruce Evans <bde@FreeBSD.org> |
Fixed the args list for mount(). We're not ready for the BSD4.4lite2/ NetBSD interface. Increased the bogusness of the args list for mmap(). The args lists for most of the memory mapping functions are bogus. The args lists in syscalls.master are a little better than the ones in the args structs currently being used, but the improvement for mmap() changed the object code and I don't want to worry about that now. Increased the bogusness of the args list for fcntl. BSD4.4lite2/NetBSD uses `void *' instead of int for the third arg. This has the advantage of working when `void *'s are longer than ints, but requires extra bogus casts that I hope to avoid. Fixed the args list for uname. `struct outsname' seems to be a typo, not an old interface. Added comments about bogus args lists for open, mount, msync, munmap, mprotect, madvise, mincore, fcntl, semsys, msgsys and shmsys.
|
#
a932a33d |
|
07-Oct-1995 |
Steven Wallace <swallace@FreeBSD.org> |
Fix misc formatting errors in makesyscalls.sh. Add CPT_NOA type which is COMPAT with NOARGS -- do not produce argument struct in sysproto. Change accept, recvfrom, getsockname to CPT_NOA type. Fix getrlimit, setrlimit argument #2 name to struct rlimit.
|
#
f171307e |
|
07-Oct-1995 |
Steven Wallace <swallace@FreeBSD.org> |
Add new functionality to makesyscalls.sh: o optional config-file to set vars: sysnames, sysproto, sysproto_h, syshdr, syssw, syshide, syscallprefix, switchname, namesname, sysvec. o change syntax of syscalls.master entry: remove argument count. add pseudo-prototype field defining function name and arguments. o generates correct structure definitions for all system calls in sys/sysproto.h o add type NOARGS: same as STD except do not create structure in sys/sysproto.h o add type NOPROTO: same as STD except do not create structure or function prototype in sys/sysproto.h New functionality provides complete prototype definitions. Useful for generating files for emulated systems like my new ibcs2 code. Update syscalls.master to reflect new changes. For example, read() entry now looks like: 3 STD POSIX { int ibcs2_read(int fd, char *buf, u_int nbytes); } This is similar to how NetBSD generates these files.
|
#
3cb43dbd |
|
19-Sep-1995 |
Bruce Evans <bde@FreeBSD.org> |
Generate prototypes for syscall-implementing functions. Put them in <sys/sysproto.h> and use them (so far only) in kern/init_sysent.c. Don't put $Id in generated files. kern/syscalls.master: I had to add some new fields to describe some non-orthogonal names. E.g., the args struct for the syscall-implementing function foo() is usually named `foo_args', but for getpid() it is named `args'. sys/sysent.h: sy_call_t is still incomplete to hide a couple of warnings.
|
#
e876c909 |
|
22-Apr-1995 |
Andrey A. Chernov <ache@FreeBSD.org> |
Make setreuid/setregid active syscalls
|
#
bc4c84cf |
|
25-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Added a third "flags" argument to msync() ...as other systems have.
|
#
403ef252 |
|
03-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Removed obsolete vtrace() remnants.
|
#
23f6ed01 |
|
14-Dec-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Actually enable NTP kernel PLL. (Oops!) Noticed by Pete Carah.
|
#
7216391e |
|
01-Oct-1994 |
David Greenman <dg@FreeBSD.org> |
"idle priority" support. Based on code from Henrik Vestergaard Draboel, but substantially rewritten by me.
|
#
5ea9b263 |
|
28-Sep-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
LKM support is no longer optional.
|
#
3f31c649 |
|
18-Sep-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Redo Kernel NTP PLL support, kernel side. This code is mostly taken from the 1.1 port (which was in turn taken from Dave Mills's kern.tar.Z example). A few significant differences: 1) ntp_gettime() is now a MIB variable rather than a system call. A few fiddles are done in libc to make it behave the same. 2) mono_time does not participate in the PLL adjustments. 3) A new interface has been defined (in <machine/clock.h>) for doing possibly machine-dependent things around the time of the clock update. This is used in Pentium kernels to disable interrupts, set `time', and reset the CPU cycle counter as quickly as possible to avoid jitter in microtime(). Measurements show an apparent resolution of a bit more than 8.14usec, which is reasonable given system-call overhead.
|
#
3d903220 |
|
13-Sep-1994 |
Doug Rabson <dfr@FreeBSD.org> |
Added SYSV ipcs. Obtained from: NetBSD and FreeBSD-1.1.5
|
#
0960a7f0 |
|
12-Sep-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Added namespace information for future pollution-control measures.
|
#
e8fb0b2c |
|
31-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Realtime priority scheduling support. Submitted by: Henrik Vestergaard Draboel
|
#
24ea21ce |
|
26-Aug-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Added ntp_gettime and ntp_adjtime syscalls, both nosys'ed out until someone gets to re-integrating the code. ntp_gettime() should be turned into a sysctl variable and emulated in the library.
|
#
3edb235c |
|
19-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Terry Lambert's loadable kernel module support w/improvements from the NetBSD group.
|
#
3c4dd356 |
|
02-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Added $Id$
|
#
26f9a767 |
|
25-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
df8bae1d |
|
24-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
BSD 4.4 Lite Kernel Sources
|