History log of /freebsd-current/sys/sys/sysent.h
Revision Date Author Comments
# eb32c1c7 02-Nov-2023 Andrew Turner <andrew@FreeBSD.org>

sysent: Add sv_protect

To allow for architecture-specific protections, add sv_protect to struct
sysent. This can be used to apply them after the executable is loaded
into the new address space.

Reviewed by: kib
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42440


# 7ec361d6 03-Oct-2023 Dmitry Chagin <dchagin@FreeBSD.org>

sysent: Trim trailing whitespaces

MFC after: 1 week


# b82b4ae7 25-Sep-2023 Konstantin Belousov <kib@FreeBSD.org>

sysentvec: add SV_SIGSYS flag

to allow ABIs to indicate that SIGSYS is needed. Mark all native
FreeBSD ABIs with the flag.

This implicitly marks Linux' ABIs as not delivering SIGSYS on invalid
syscall.

Reviewed by: dchagin, markj
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D41976


# 39024a89 25-Sep-2023 Konstantin Belousov <kib@FreeBSD.org>

syscalls: fix missing SIGSYS for several ENOSYS errors

In particular, when the syscall number is too large, or when syscall is
dynamic. For that, add nosys_sysent structure to pass fake sysent to
syscall top code.

Reviewed by: dchagin, markj
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D41976


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# d706d02e 29-May-2023 Dmitry Chagin <dchagin@FreeBSD.org>

sysentvec: Retire sv_imgact_try as no longer needed

The sysentvec sv_imgact_try was used by kern_exec() to allow a
non-native ABI to fix up the shell path according to the ABI root directory.
Since a non-native ABI can now specify its root directory directly
to namei() via a pwd_altroot() call, this facility is not needed anymore.

Differential Revision: https://reviews.freebsd.org/D40092
MFC after: 2 months


# 361971fb 02-Jun-2022 Kornel Dulęba <kd@FreeBSD.org>

Rework how shared page related data is stored

Store the shared page address in struct vmspace.
Also instead of storing absolute addresses of various shared page
segments save their offsets with respect to the shared page address.
This will be more useful when the shared page address is randomized.

Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35393
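
A minimal sketch of the base-plus-offset idea described in the commit above.
The struct and function names here (abi_shared_offsets, address_space,
shared_page_addr) are hypothetical and are not the kernel's; the point is only
that an ABI can keep fixed offsets while each address space supplies its own
base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-ABI data: offsets into the shared page, not addresses. */
struct abi_shared_offsets {
	uintptr_t sigcode_offset;	/* e.g. signal trampoline */
	uintptr_t timekeep_offset;	/* e.g. timekeeping data */
};

/* Hypothetical per-address-space data: where the shared page is mapped. */
struct address_space {
	uintptr_t shared_page_base;	/* may differ per process if randomized */
};

/* Resolve an offset to an absolute address for one address space. */
uintptr_t
shared_page_addr(const struct address_space *as, uintptr_t offset)
{
	assert(as != NULL);
	return (as->shared_page_base + offset);
}
```

With this layout, two processes with different bases can share one read-only
table of offsets, which is what makes a randomized base cheap to support.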


# eca368ec 20-May-2022 Dmitry Chagin <dchagin@FreeBSD.org>

Retire sv_transtrap

Call translate_traps directly from sendsig().

MFC after: 2 weeks


# 548a2ec4 24-Jan-2022 Andrew Turner <andrew@FreeBSD.org>

Add PT_GETREGSET

This adds the PT_GETREGSET and PT_SETREGSET ptrace types. These can be
used to access all the registers from a specified core dump note type.
The NT_PRSTATUS and NT_FPREGSET notes are initially supported. Other
machine-dependent types are expected to be added in the future.

The ptrace addr points to a struct iovec pointing at memory to hold the
registers along with its length. On success the length in the iovec is
updated to tell userspace the actual length the kernel wrote or, if the
base address is NULL, the length the kernel would have written.

Because the data field is an int the arguments are backwards when
compared to the Linux PTRACE_GETREGSET call.

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19831
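
The iovec length contract described in the commit above can be illustrated
with a userspace mock. This is a sketch, not the kernel implementation:
mock_getregset is a hypothetical function, and returning -1 for a too-small
buffer is an assumption of the sketch rather than documented ptrace behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Mock of the length contract only; this is not kernel code.
 * Returns 0 on success, -1 if the caller's buffer is too small
 * (the error behavior is an assumption of this sketch).
 */
int
mock_getregset(struct iovec *iov, const void *regs, size_t regs_len)
{
	assert(iov != NULL && regs != NULL);
	if (iov->iov_base == NULL) {
		/* Size query: report what would have been written. */
		iov->iov_len = regs_len;
		return (0);
	}
	if (iov->iov_len < regs_len)
		return (-1);
	memcpy(iov->iov_base, regs, regs_len);
	iov->iov_len = regs_len;	/* actual length written */
	return (0);
}
```

A caller would typically make two calls: one with a NULL base to learn the
size, then one with a buffer of at least that size.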


# 758d98de 17-Jan-2022 Mark Johnston <markj@FreeBSD.org>

exec: Remove the stack gap implementation

ASLR stack randomization will reappear in a forthcoming commit. Rather
than inserting a random gap into the stack mapping, the entire stack
mapping itself will be randomized in the same way that other mappings
are when ASLR is enabled.

No functional change intended, as the stack gap implementation is
currently disabled by default.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33704


# 3fc21fdd 17-Jan-2022 Mark Johnston <markj@FreeBSD.org>

sysent: Add a sv_psstringssz field to struct sysentvec

The size of the ps_strings structure varies between ABIs, so this is
useful for computing the address of the ps_strings structure relative to
the top of the stack when stack address randomization is enabled.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33704
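
As a rough illustration of why a per-ABI size is enough, the address can be
derived from the top of the stack once that top is known at run time. The
function name ps_strings_addr is hypothetical, and the sketch assumes the
structure sits immediately below the stack top.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute where the ps_strings structure would live, given the
 * (possibly randomized) top of the stack and the ABI's structure size.
 */
uintptr_t
ps_strings_addr(uintptr_t stack_top, size_t psstringssz)
{
	assert(psstringssz <= stack_top);
	return (stack_top - psstringssz);
}
```

A 32-bit compat ABI and a 64-bit native ABI would pass different sizes to the
same computation.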


# 01c77a43 11-Nov-2021 Konstantin Belousov <kib@FreeBSD.org>

Pass vdso address to userspace

Reviewed by: emaste
Discussed with: jrtc27
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D32960


# ab4524b3 05-Nov-2021 Konstantin Belousov <kib@FreeBSD.org>

amd64: wrap 64bit sigtramp into vdso

Reviewed by: emaste
Discussed with: jrtc27
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D32960


# a8efd4d1 29-Nov-2021 Brooks Davis <brooks@FreeBSD.org>

syscalls: make syscall and __syscall SYSMUX

Rather than combining the declaration of nosys with the registration
of SYS_syscall, declare syscall(2) and __syscall(2) with the new
SYSMUX type in syscalls.master and declare nosys directly. This
eliminates the last use of syscall aliases in the tree.

Reviewed by: kib, imp


# 1c696903 20-Oct-2021 Konstantin Belousov <kib@FreeBSD.org>

Unmap shared page manually before doing vm_map_remove() on exit or exec

This allows the pmap_remove(min, max) call to see empty pmap and exploit
empty pmap optimization.

Reviewed by: markj
Tested by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32569


# 889b56c8 13-Oct-2021 Dawid Gorecki <dgr@semihalf.com>

setrlimit: Take stack gap into account.

Calling setrlimit with stack gap enabled and with low values of stack
resource limit often caused the program to abort immediately after
exiting the syscall. This happened due to the fact that the resource
limit was calculated assuming that the stack started at sv_usrstack,
while with stack gap enabled the stack is moved by a random number
of bytes.

Save information about stack size in struct vmspace and adjust the
rlim_cur value. If the rlim_cur and stack gap is bigger than rlim_max,
then the value is truncated to rlim_max.

PR: 253208
Reviewed by: kib
Obtained from: Semihalf
Sponsored by: Stormshield
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D31516
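
The adjustment described in the commit above amounts to adding the gap to the
soft limit and clamping to the hard limit. A minimal sketch, assuming unsigned
64-bit limits; adjust_stack_limit is a hypothetical name, not the kernel
function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the soft stack limit widened by the stack gap, truncated to
 * rlim_max. The comparison is arranged so the addition cannot wrap.
 */
uint64_t
adjust_stack_limit(uint64_t rlim_cur, uint64_t rlim_max, uint64_t stack_gap)
{
	if (stack_gap > rlim_max || rlim_cur > rlim_max - stack_gap)
		return (rlim_max);
	return (rlim_cur + stack_gap);
}
```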


# 397f1889 12-Sep-2021 Konstantin Belousov <kib@FreeBSD.org>

Remove SV_CAPSICUM

It was only needed for cloudabi

Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D31923


# cf0ee873 12-Sep-2021 Konstantin Belousov <kib@FreeBSD.org>

Drop cloudabi

According to https://github.com/NuxiNL/cloudlibc:
CloudABI is no longer being maintained. It was an awesome experiment,
but it never got enough traction to be sustainable.

There is no reason to keep it in FreeBSD.

Approved by: ed (private mail)
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D31923


# de8374df 12-Aug-2021 Dmitry Chagin <dchagin@FreeBSD.org>

fork: Allow ABI to specify fork return values for child.

At least the Linux x86 ABIs do not use the carry bit and expect that the dx register
is preserved. For this, add a new sv_set_fork_retval hook and call it from cpu_fork().

Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.

Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D31472
MFC after: 2 weeks


# 5fd9cd53 20-Jul-2021 Dmitry Chagin <dchagin@FreeBSD.org>

linux(4): Modify sv_onexec hook to return an error.

Temporarily add stubs to the Linux emulation layer which call the existing hook.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D30911
MFC after: 2 weeks


# 28a66fc3 01-Jul-2021 Konstantin Belousov <kib@FreeBSD.org>

Do not call FreeBSD-ABI specific code for all ABIs

Use sysentvec hooks to only call umtx_thread_exit/umtx_exec, which handle
robust mutexes, for native FreeBSD ABI. Similarly, there is no sense
in calling sigfastblock_clear() for non-native ABIs.

Requested by: dchagin
Reviewed by: dchagin, markj (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30987


# 71ab3445 01-Jul-2021 Konstantin Belousov <kib@FreeBSD.org>

Add sv_onexec_old() sysent hook for exec event

Unlike sv_onexec(), it is called from the old (pre-exec) sysentvec structure.
The old vmspace for the process is still intact during the call.

Reviewed by: dchagin, markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30987


# 435754a5 29-Jun-2021 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add infrastructure required for Linux coredump support

This adds `sv_elf_core_osabi`, `sv_elf_core_abi_vendor`,
and `sv_elf_core_prepare_notes` fields to `struct sysentvec`,
and modifies imgact_elf.c to make use of them instead
of hardcoding FreeBSD-specific values. It also updates all
of the ABI definitions to preserve current behaviour.

This makes it possible to implement non-native ELF coredump
support without unnecessary code duplication. It will be used
for Linux coredumps.

Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D30921


# a12e901a 05-Jun-2021 Konstantin Belousov <kib@FreeBSD.org>

Add a knob to disable dequeueing SIGCHLD on waiting for live process

It seems that Linux does not dequeue siginfo for SIGCHLD when wait*(2)
reports status of the running process. In particular, sigwaitinfo(2)
and other signal querying syscalls can observe the siginfo after wait.

FreeBSD dequeued siginfo from the beginning, so we cannot change the
default ABI to be more compatible. Still, add a knob to switch
to the other behavior for debugging purposes.

Reported by: dchagin
Reviewed by: dchagin, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30675


# bc387624 05-Jun-2021 Konstantin Belousov <kib@FreeBSD.org>

Add a knob to not drop signal with default ignored or ignored actions

Traditionally, BSD drops signals with the default action during send,
not even putting them to the destination process queue. This semantic
is not shared with other operating systems (Linux), which do queue
such signals. In particular, sigtimedwait(2) and related syscalls can
observe the delivery.

Add a global knob kern.sig_discard_ign which can be set to false to force
enqueuing of the signals with default action. Also add an ABI flag to
indicate that signals should be queued.

Note that it is not practical to run with the knob turned on, because almost
all software that cares about the delivery of such signals is aware of the
difference and misbehaves if the signals are actually queued. The purpose
of the knob, as is, is to allow for easier diagnosis of the programs that
need the adjustments, to confirm the cause of the problem.

Reported by: dchagin
Reviewed by: dchagin, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30675


# 62b8258a 06-Jun-2021 Konstantin Belousov <kib@FreeBSD.org>

Change the return type of sv__setid_allowed from bool to int

to please some userspace code using sys/sysent.h.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 2d423f76 14-Jan-2021 Konstantin Belousov <kib@FreeBSD.org>

sysent: allow ABI to disable setid on exec.

Reviewed by: dchagin
Tested by: trasz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28154


# 86ffb3d1 24-Apr-2021 Konstantin Belousov <kib@FreeBSD.org>

ELF coredump: define several useful flags for the coredump operations

- SVC_ALL request dumping all map entries, including those marked as
non-dumpable
- SVC_NOCOMPRESS disallows compressing the dump regardless of the sysctl
policy
- SVC_PC_COREDUMP is provided for future use by userspace core dump
request

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29955


# 87a9b18d 23-Nov-2020 Konstantin Belousov <kib@FreeBSD.org>

Provide ABI modules hooks for process exec/exit and thread exit.

Exec and exit are same as corresponding eventhandler hooks.

Thread exit hook is called somewhat earlier, while thread is still
owned by the process and enough context is available. Note that the
process lock is owned when the hook is called.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27309


# a1bd83fe 08-Nov-2020 Edward Tomasz Napierala <trasz@FreeBSD.org>

Move syscall_thread_{enter,exit}() into the slow path. This is only
needed for syscalls from unloadable modules.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: EPSRC
Differential Revision: https://reviews.freebsd.org/D26988


# f8e8a06d 10-Oct-2020 Conrad Meyer <cem@FreeBSD.org>

random(4) FenestrasX: Push root seed version to arc4random(3)

Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled. Otherwise, there is no
functional change. The mechanism can be disabled with
debug.fxrng_vdso_enable=0.

arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time. Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed. On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.

This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator. That change is
left for future discussion/work.

Reviewed by: kib (previous version)
Approved by: csprng (me -- only touching FXRNG here)
Differential Revision: https://reviews.freebsd.org/D22839
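
The version comparison described in the commit above can be sketched as
follows. This is a simplified model, not libc's arc4random(3): struct user_rng
and rng_check_reseed are hypothetical, and the actual rekeying from the kernel
is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct user_rng {
	const volatile uint64_t *root_seed_version; /* published by the kernel */
	uint64_t local_seed_version;	/* root version at the last rekey */
	unsigned int reseed_count;	/* illustration only */
};

/*
 * Called before producing output. Returns true if the generator had to
 * reseed because the published root seed version moved on.
 */
bool
rng_check_reseed(struct user_rng *rng)
{
	uint64_t current;

	assert(rng != NULL && rng->root_seed_version != NULL);
	current = *rng->root_seed_version;
	if (current == rng->local_seed_version)
		return (false);
	/* A real generator would fetch fresh key material here. */
	rng->local_seed_version = current;
	rng->reseed_count++;
	return (true);
}
```

The common path is a single load and compare, so the check costs little when
the root seed has not changed.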


# 4abea760 27-Sep-2020 Edward Tomasz Napierala <trasz@FreeBSD.org>

Shrink struct sysent from 48 to 32 bytes (on LP64; on ILP32 it's probably
from 32 to 28) by shrinking some entries and reordering them.

Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26508
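
The effect of reordering members on structure size can be shown with a toy
example. The two structs below are hypothetical and do not mirror the real
struct sysent; they hold the same members in different orders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small members interleaved with larger ones force padding after each. */
struct loose_entry {
	uint8_t a;
	void *p;
	uint8_t b;
	uint32_t c;
	uint8_t d;
};

/* Same members, ordered from largest to smallest alignment. */
struct tight_entry {
	void *p;
	uint32_t c;
	uint8_t a;
	uint8_t b;
	uint8_t d;
};

_Static_assert(sizeof(struct tight_entry) <= sizeof(struct loose_entry),
    "reordering should never make this layout larger");
```

On a typical LP64 ABI the first layout pads after a, b, and d, while the
second only needs tail padding.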


# 70890254 17-Sep-2020 Edward Tomasz Napierala <trasz@FreeBSD.org>

Get rid of sv_errtbl and SV_ABI_ERRNO().

Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26388


# 2b313da3 23-Aug-2020 Konstantin Belousov <kib@FreeBSD.org>

kern_sharedpage.c: Add exec_sysvec_init_secondary() helper.

It allows a sysent to share existing usermode data in shared page with
other sysent, assuming ABI differences are not in the layout of the
page.

Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D25273


# 3da4d19b 27-Apr-2020 John Baldwin <jhb@FreeBSD.org>

Extend support in sysctls for supporting multiple native ABIs.

This extends some of the changes in place to support reporting support
for 32-bit ABIs to permit reporting hard-float vs soft-float ABIs.

Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24542


# e8049590 28-Feb-2020 Mark Johnston <markj@FreeBSD.org>

Fix r358436 to not declare kernel symbols when _KERNEL is not defined.

Reported by: Jenkins, Michael Butler
Pointy hat: markj


# b4ffc02b 28-Feb-2020 Mark Johnston <markj@FreeBSD.org>

sy_call_t and systrace_args_func_t need to be visible to userspace.

Reported by: Jenkins


# 46994ec2 28-Feb-2020 Mark Johnston <markj@FreeBSD.org>

Fix standalone builds of systrace.ko after r357912.

Sponsored by: The FreeBSD Foundation


# 2f729243 14-Feb-2020 Mateusz Guzik <mjg@FreeBSD.org>

Merge audit and systrace checks

This further shortens the syscall routine by not having to re-check after
the system call.


# d8010b11 09-Dec-2019 John Baldwin <jhb@FreeBSD.org>

Copy out aux args after the argument and environment vectors.

Partially revert r354741 and r354754 and go back to allocating a
fixed-size chunk of stack space for the auxiliary vector. Keep
sv_copyout_auxargs but change it to accept the address at the end of
the environment vector as an input stack address and no longer
allocate room on the stack. It is now called at the end of
copyout_strings after the argv and environment vectors have been
copied out.

This should fix a regression in r354754 that broke the stack alignment
for newer Linux amd64 binaries (and probably broke Linux arm64 as
well).

Reviewed by: kib
Tested on: amd64 (native, linux64 (only linux-base-c7), and i386)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22695


# 31174518 03-Dec-2019 John Baldwin <jhb@FreeBSD.org>

Use uintptr_t instead of register_t * for the stack base.

- Use ustringp for the location of the argv and environment strings
and allow destp to travel further down the stack for the stackgap
and auxv regions.
- Update the Linux copyout_strings variants to move destp down the
stack as was done for the native ABIs in r263349.
- Stop allocating a space for a stack gap in the Linux ABIs. This
used to hold translated system call arguments, but hasn't been used
since r159992.

Reviewed by: kib
Tested on: amd64 (amd64, i386, linux64), i386 (i386, linux)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22501


# 03b0d68c 18-Nov-2019 John Baldwin <jhb@FreeBSD.org>

Check for errors from copyout() and suword*() in sv_copyout_args/strings.

Reviewed by: brooks, kib
Tested on: amd64 (amd64, i386, linux64), i386 (i386, linux)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22401


# e3532331 15-Nov-2019 John Baldwin <jhb@FreeBSD.org>

Add a sv_copyout_auxargs() hook in sysentvec.

Change the FreeBSD ELF ABIs to use this new hook to copyout ELF auxv
instead of doing it in the sv_fixup hook. In particular, this new
hook allows the stack space to be allocated at the same time the auxv
values are copied out to userland. This allows us to avoid wasting
space for unused auxv entries as well as not having to recalculate
where the auxv vector is by walking back up over the argv and
environment vectors.

Reviewed by: brooks, emaste
Tested on: amd64 (amd64 and i386 binaries), i386, mips, mips64
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22355


# fc83c5a7 31-Jul-2019 Konstantin Belousov <kib@FreeBSD.org>

Make randomized stack gap between strings and pointers to argv/envs.

This effectively makes the stack base on the csu _start entry
randomized.

The gap is enabled if ASLR for the ABI is enabled, and then
kern.elf{64,32}.aslr.stack_gap specify the max percentage of the
initial stack size that can be wasted for gap. Setting it to zero
disables the gap, and max is capped at 50%.

Only amd64 for now.

Reviewed by: cem, markj
Discussed with: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21081
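
A small sketch of the percentage rule described in the commit above: the
configured value bounds the gap as a fraction of the initial stack size, zero
disables it, and anything above 50 is treated as 50. The names max_stack_gap
and pick_stack_gap are hypothetical, the caller supplies the random value, and
the inclusive upper bound is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define GAP_MAX_PERCENT	50	/* cap stated in the commit message */

/* Largest gap allowed for a given stack size and configured percentage. */
uint64_t
max_stack_gap(uint64_t stack_size, unsigned int percent)
{
	if (percent > GAP_MAX_PERCENT)
		percent = GAP_MAX_PERCENT;
	/* Divide first so the multiplication cannot overflow. */
	return (stack_size / 100 * percent);
}

/* Map a caller-supplied random value into the range [0, max gap]. */
uint64_t
pick_stack_gap(uint64_t stack_size, unsigned int percent,
    uint64_t random_value)
{
	uint64_t max_gap = max_stack_gap(stack_size, percent);

	return (random_value % (max_gap + 1));
}
```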


# 1699546d 01-Mar-2019 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove sv_pagesize, originally introduced with r100384.

In all of the architectures we have today, we always use PAGE_SIZE.
While in theory one could define different things, none of the
current architectures do, even the ones that have transitioned from
32-bit to 64-bit like i386 and arm. Some ancient mips binaries on
other systems used 8k instead of 4k, but we don't support running
those and likely never will due to their age and obscurity.

Reviewed by: imp (who also contributed the commit message)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D19280


# fa50a355 10-Feb-2019 Konstantin Belousov <kib@FreeBSD.org>

Implement Address Space Layout Randomization (ASLR)

With this change, randomization can be enabled for all non-fixed
mappings. It means that the base address for the mapping is selected
with a guaranteed amount of entropy (bits). If the mapping was
requested to be superpage aligned, the randomization honours the
superpage attributes.

Although the value of ASLR is diminishing over time as exploit authors
work out simple ASLR bypass techniques, it eliminates the trivial
exploitation of certain vulnerabilities, at least in theory. This
implementation is relatively small and happens at the correct
architectural level. Also, it is not expected to introduce
regressions in existing cases when turned off (default for now), or
cause any significant maintenance burden.

The randomization is done on a best-effort basis - that is, the
allocator falls back to a first fit strategy if fragmentation prevents
entropy injection. It is trivial to implement a strong mode where
failure to guarantee the requested amount of entropy results in
mapping request failure, but I do not consider that to be usable.

I have not fine-tuned the amount of entropy injected right now. It is
only a quantitative change that will not change the implementation. The
current amount is controlled by aslr_pages_rnd.

To not spoil coalescing optimizations, to reduce the page table
fragmentation inherent to ASLR, and to keep the transient superpage
promotion for the malloced memory, locality clustering is implemented
for anonymous private mappings, which are automatically grouped until
fragmentation kicks in. The initial location for the anon group range
is, of course, randomized. This is controlled by vm.cluster_anon,
enabled by default.

The default mode keeps the sbrk area unpopulated by other mappings,
but this can be turned off, which gives much more breathing bits on
architectures with small address space, such as i386. This is tied
with the question of following an application's hint about the mmap(2)
base address. Testing shows that ignoring the hint does not affect the
function of common applications, but I would expect more demanding
code could break. By default sbrk is preserved and mmap hints are
satisfied, which can be changed by using the
kern.elf{32,64}.aslr.honor_sbrk sysctl.

ASLR is enabled on per-ABI basis, and currently it is only allowed on
FreeBSD native i386 and amd64 (including compat 32bit) ABIs. Support
for additional architectures will be added after further testing.

Both per-process and per-image controls are implemented:
- procctl(2) adds PROC_ASLR_CTL/PROC_ASLR_STATUS;
- NT_FREEBSD_FCTL_ASLR_DISABLE feature control note bit makes it possible
to force ASLR off for the given binary. (A tool to edit the feature
control note is in development.)
Global controls are:
- kern.elf{32,64}.aslr.enable - for non-fixed mappings done by mmap(2);
- kern.elf{32,64}.aslr.pie_enable - for PIE image activation mappings;
- kern.elf{32,64}.aslr.honor_sbrk - allow to use sbrk area for mmap(2);
- vm.cluster_anon - enables anon mapping clustering.

PR: 208580 (exp runs)
Exp-runs done by: antoine
Reviewed by: markj (previous version)
Discussed with: emaste
Tested by: pho
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D5603


# a7f67fac 08-Feb-2019 Konstantin Belousov <kib@FreeBSD.org>

Normalize the declaration of i386_read_exec variable.

It is currently re-declared in sys/sysent.h, which is the wrong place for
an MD variable. This causes a redeclaration error with gcc when both
sys/sysent.h and machine/md_var.h are included.

Remove it from sys/sysent.h and instead include machine/md_var.h when
needed, under #ifdef for both i386 and amd64.

Reported and tested by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 628888f0 19-Dec-2018 Mateusz Guzik <mjg@FreeBSD.org>

Remove iBCS2, part2: general kernel

Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation


# 79ca7cbf 07-May-2018 Mateusz Guzik <mjg@FreeBSD.org>

Avoid calls to syscall_thread_enter/exit for statically defined syscalls

The entire mechanism is rarely used and performs poorly due to
atomic ops on the syscall table. It also adds overhead for completely
unrelated syscalls.

Reduce it by avoiding the function calls where possible (which constitutes
the vast majority of cases).

Provides about 3% syscall rate speed up for getuid on Broadwell.


# d552176e 27-Apr-2018 Mateusz Guzik <mjg@FreeBSD.org>

Unbreak world build after r333064

Reported by: O. Hartmann <ohartmann walstatt.org>


# 9d68f774 27-Apr-2018 Mateusz Guzik <mjg@FreeBSD.org>

systrace: track it like sdt probes

While here predict false.

Note the code is wrong (regardless of this change). Dereference of the
pointer can race with module unload. A fix would set the probe to a
nop stub instead of NULL.
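
The fix suggested in the commit above, resetting the probe to a do-nothing
stub rather than NULL, can be sketched in plain C. All names here are
hypothetical and the sketch is single-threaded; kernel code would also need
suitable atomics or synchronization around the pointer.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*probe_func_t)(int syscall_num);

int probe_hits;	/* counts calls into the example probe */

/* Always-valid default target, so callers never dereference NULL. */
void
nop_probe(int syscall_num)
{
	(void)syscall_num;
}

/* Example probe that a loadable module might install. */
void
counting_probe(int syscall_num)
{
	(void)syscall_num;
	probe_hits++;
}

probe_func_t active_probe = nop_probe;

void
probe_register(probe_func_t func)
{
	assert(func != NULL);
	active_probe = func;
}

/* On unload, fall back to the stub instead of clearing the pointer. */
void
probe_unregister(void)
{
	active_probe = nop_probe;
}

/* Hot path: no NULL check needed, the pointer always targets a function. */
void
syscall_trace(int syscall_num)
{
	active_probe(syscall_num);
}
```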


# 581bf7cb 21-Feb-2018 Ed Maste <emaste@FreeBSD.org>

Use 'const int *' for sysentvec errno translation table

This allows an sv_errtbl to be read-only .rodata.

Sponsored by: Turing Robotic Industries Inc.


# b81e88d2 20-Feb-2018 Brooks Davis <brooks@FreeBSD.org>

Reduce duplication in dynamic syscall registration code.

Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.

The default and freebsd32 versions differed in which array of struct
sysents they used and a few missing updates to the 32-bit code as
features were added to the main code.

Reviewed by: cem
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14337


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# 904d8c49 20-Oct-2017 Michal Meloun <mmel@FreeBSD.org>

Add AT_HWCAP2 ELF auxiliary vector.
- allocate value for new AT_HWCAP2 auxiliary vector on all platforms.
- expand 'struct sysentvec' by new 'u_long *sv_hwcap2', in exactly
same way as for AT_HWCAP.

MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D12699


# c2f37b92 14-Sep-2017 John Baldwin <jhb@FreeBSD.org>

Add AT_HWCAP and AT_EHDRFLAGS on all platforms.

A new 'u_long *sv_hwcap' field is added to 'struct sysentvec'. A
process ABI can set this field to point to a value holding a mask of
architecture-specific CPU feature flags. If an ABI does not wish to
supply AT_HWCAP to processes the field can be left as NULL.

The support code for AT_EHDRFLAGS was already present on all systems,
just the #define was not present. This is a step towards unifying the
AT_* constants across platforms.

Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12290
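
The optional-pointer convention described in the commit above can be sketched
as follows. The struct names, emit_hwcap, and the SKETCH_AT_HWCAP value are
placeholders for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_AT_HWCAP	25	/* placeholder tag value for this sketch */

struct auxv_entry {
	long type;
	unsigned long value;
};

/* Hypothetical ABI descriptor with an optional feature-mask pointer. */
struct abi_desc {
	unsigned long *hwcap;	/* NULL: do not supply the entry */
};

/*
 * Write an AT_HWCAP-style entry if the ABI provides one.
 * Returns the number of entries written (0 or 1).
 */
size_t
emit_hwcap(const struct abi_desc *abi, struct auxv_entry *out)
{
	assert(abi != NULL && out != NULL);
	if (abi->hwcap == NULL)
		return (0);
	out->type = SKETCH_AT_HWCAP;
	out->value = *abi->hwcap;
	return (1);
}
```

Using a pointer rather than a value lets the mask be filled in after the
descriptor is defined, for example once CPU features have been probed.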


# 5cead591 14-Jul-2017 Konstantin Belousov <kib@FreeBSD.org>

Correct sysent flags for dynamically loaded syscalls.

Using the https://github.com/google/capsicum-test/ suite, the
PosixMqueue.CapModeForked test was failing due to an ECAPMODE after
calling kmq_notify(). On further inspection, the dynamically
loaded syscall entry was initialized with sy_flags zeroed out, since
SYSCALL_INIT_HELPER() left sysent.sy_flags with the default value.

Add a new helper SYSCALL{,32}_INIT_HELPER_F() which takes an
additional argument to specify the sy_flags value.

Submitted by: Siva Mahadevan <smahadevan@freebsdfoundation.org>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D11576
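
To illustrate why a zeroed flags field leads to the failure described in the
commit above, here is a mock of a flag-taking initializer and a capability-mode
check. Everything prefixed MOCK_ or mock_ is hypothetical; the real macros and
dispatch code live in the kernel and differ in detail.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_CAPENABLED	0x1	/* placeholder "allowed in capability mode" bit */
#define MOCK_ECAPMODE	(-1)	/* placeholder error value */

struct mock_dyn_sysent {
	int (*call)(void);
	unsigned int flags;
};

/* Flag-taking initializer, and the older form that leaves flags at zero. */
#define MOCK_INIT_HELPER_F(fn, fl)	{ .call = (fn), .flags = (fl) }
#define MOCK_INIT_HELPER(fn)		MOCK_INIT_HELPER_F(fn, 0)

int
mock_notify(void)
{
	return (0);
}

/* Refuse unflagged syscalls when the caller is in capability mode. */
int
mock_dispatch(const struct mock_dyn_sysent *se, int in_capability_mode)
{
	assert(se != NULL && se->call != NULL);
	if (in_capability_mode && (se->flags & MOCK_CAPENABLED) == 0)
		return (MOCK_ECAPMODE);
	return (se->call());
}
```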


# 2d88da2f 12-Jun-2017 Konstantin Belousov <kib@FreeBSD.org>

Move struct syscall_args syscall arguments parameters container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack. The structure takes 88 bytes on 64bit arches
which is not negligible. Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already. The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by: jhb (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
X-Differential revision: https://reviews.freebsd.org/D11080


# 27f33312 28-Feb-2017 Konstantin Belousov <kib@FreeBSD.org>

Add some explanation for SV_TIMEKEEP flag.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days


# fbbd9655 28-Feb-2017 Warner Losh <imp@FreeBSD.org>

Renumber copyright clause 4

Renumber clause 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistence on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96


# 591d7b63 25-Jul-2016 Julian Elischer <julian@FreeBSD.org>

Split MAKE_SYSENT into two parts so that the initializer part can be
used separately if one wants to embed the sysent into a larger structure.

MFC after: 1 week
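
The value of splitting a declaration macro from its initializer can be shown
with a toy version. The MOCK_ macros and structs below are hypothetical and
only mimic the pattern; they are not the definitions from sys/sysent.h.

```c
#include <assert.h>
#include <stddef.h>

struct mock_sysent {
	int narg;
	int (*call)(void *);
};

/* Initializer part: usable anywhere a struct mock_sysent is initialized. */
#define MOCK_SYSENT_INIT_VALS(fn, n)	{ .narg = (n), .call = (fn) }

/* Declaration part: defines a standalone variable via the initializer. */
#define MOCK_MAKE_SYSENT(name, fn, n)					\
	static struct mock_sysent name##_sysent =			\
	    MOCK_SYSENT_INIT_VALS(fn, n)

/* A larger structure that embeds the entry next to other data. */
struct mock_wrapper {
	const char *label;
	struct mock_sysent se;
};

int
mock_handler(void *args)
{
	(void)args;
	return (0);
}
```

With only the combined macro, the embedded member in mock_wrapper would have to
be spelled out by hand; with the split, both uses share one initializer.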


# 2664baad 22-May-2016 Dmitry Chagin <dchagin@FreeBSD.org>

Remove a now unused global declaration of some sysentvec struct.

MFC after: 2 weeks


# 5437e1d1 21-May-2016 Dmitry Chagin <dchagin@FreeBSD.org>

Add macro to convert errno and use it when appropriate.

MFC after: 1 week


# 038c7205 09-Jan-2016 Dmitry Chagin <dchagin@FreeBSD.org>

Implement vsyscall hack. Prior to 2.13 glibc uses vsyscall
instead of vdso. An upcoming linux_base-c6 needs it.

Differential Revision: https://reviews.freebsd.org/D1090

Reviewed by: kib, trasz
MFC after: 1 week


# 8ff6d9dd 16-Dec-2015 Mark Johnston <markj@FreeBSD.org>

Support an arbitrary number of arguments to DTrace syscall probes.

Rather than pushing all eight possible arguments into dtrace_probe()'s
stack frame, make the syscall_args struct for the current syscall available
via the current thread. Using a custom getargval method for the systrace
provider, this allows any syscall argument to be fetched, even in kernels
that have modified the maximum number of system call arguments.

Sponsored by: EMC / Isilon Storage Division


# 724f4b62 28-Nov-2015 Konstantin Belousov <kib@FreeBSD.org>

Remove sv_prepsyscall, sv_sigsize and sv_sigtbl members of the struct
sysent.

sv_prepsyscall is unused.

sv_sigsize and sv_sigtbl translate signal number from the FreeBSD
namespace into the ABI domain. It is only utilized on i386 for iBCS2
binaries. The issue with this approach is that signals for iBCS2 were
delivered with the FreeBSD signal frame layout, which does not follow
iBCS2. The same note is true for any other potential user if
sv_sigtbl. In other words, if ABI needs signal number translation, it
really needs custom sv_sendsig method instead.

Sponsored by: The FreeBSD Foundation


# 5e27d793 23-Nov-2015 Konstantin Belousov <kib@FreeBSD.org>

Split the kernel timekeep ABI structure vdso_sv_tk out of the struct
sysentvec. This allows the timekeep data to be shared between similar
ABIs which cannot share sysentvec.

Make the timekeep_push_vdso() tick callback to the timekeep structures
instead of sysentvecs. If several sysentvec share the vdso_sv_tk
structure, we would update the userspace data several times on each
tick, without the change.

Only allocate vdso_sv_tk in the exec_sysvec_init() sysinit when
sysentvec is marked with the new SV_TIMEKEEP flag. This saves
allocation and update of unneeded vdso_sv_tk for ABIs which do not
provide userspace gettimeofday yet, which are PowerPC arches right
now.

Make vdso_sv_tk allocator public, namely split out and export
alloc_sv_tk() and alloc_sv_tk_compat32(). ABIs which share timekeep
data now can allocate it manually and share as appropriate.

Requested by: nwhitehorn
Tested by: nwhitehorn, pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 39f5ebb7 03-Aug-2015 Ed Schouten <ed@FreeBSD.org>

Add sysent flag to switch to capabilities mode on startup.

CloudABI processes should run in capabilities mode automatically. There
is no need to switch manually (e.g., by calling cap_enter()). Add a
flag, SV_CAPSICUM, that can be used to call into cap_enter() during
execve().

Reviewed by: kib


# 6e5fcd99 16-Jul-2015 Ed Schouten <ed@FreeBSD.org>

Add a sysentvec for CloudABI on x86-64.

Summary:
For CloudABI we need to put two things on the stack of new processes:
the argument data (a binary blob; not strings) and a startup data
structure. The startup data structure contains interesting things such
as a pointer to the ELF program header, the thread ID of the initial
thread, a stack smashing protection canary, and a pointer to the
argument data.

Fetching system call arguments and setting the return value is similar
to FreeBSD. The only differences are that system call 0 does not exist
and that we call into cloudabi_convert_errno() to convert the error
code. We also need this function in a couple of other places, so we'd
better reuse it here.

Reviewers: dchagin, kib

Reviewed By: kib

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D3098


# 91d1786f 24-May-2015 Dmitry Chagin <dchagin@FreeBSD.org>

In preparation for switching linuxulator to the use of native 1:1
threads, add a hook for cleaning thread resources before the thread dies.

Differential Revision: https://reviews.freebsd.org/D1038


# cdcf2428 01-Nov-2014 Mateusz Guzik <mjg@FreeBSD.org>

Fix up module unload for syscall_module_handler consumers.

After r273707 it was registering syscalls as static.

This fixes hwpmc module unload.

Reported by: markj


# e015b1ab 26-Oct-2014 Mateusz Guzik <mjg@FreeBSD.org>

Avoid dynamic syscall overhead for statically compiled modules.

The kernel tracks syscall users so that modules can safely unregister them.

But if the module is not unloadable or was compiled into the kernel, there is
no need to do this.

Achieve this by adding SY_THR_STATIC_KLD macro which expands to SY_THR_STATIC
during kernel build and 0 otherwise.
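
The macro selection described above can be sketched roughly as follows. This
is an illustration, not the committed code: the value of SY_THR_STATIC and the
helper needs_thread_tracking() are assumptions, and the sketch assumes the
build marks module compilation by defining KLD_MODULE.

```c
/*
 * Sketch only. The value of SY_THR_STATIC is assumed for illustration;
 * KLD_MODULE is assumed to be defined when compiling a loadable module.
 */
#define SY_THR_STATIC 0x1

#ifdef KLD_MODULE
#define SY_THR_STATIC_KLD 0               /* loadable: keep tracking users */
#else
#define SY_THR_STATIC_KLD SY_THR_STATIC   /* compiled in: no tracking needed */
#endif

/* Hypothetical helper: does a syscall entry need enter/leave accounting? */
static int
needs_thread_tracking(unsigned int thrcnt_flags)
{
	return ((thrcnt_flags & SY_THR_STATIC) == 0);
}
```

A module would pass SY_THR_STATIC_KLD when registering its syscall, so the
same source gets the cheap path when linked into the kernel and the tracked
path when built as a module.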

Reviewed by: kib (previous version)
MFC after: 2 weeks


# e7d939bd 06-Jul-2014 Marcel Moolenaar <marcel@FreeBSD.org>

Remove ia64.

This includes:
o All directories named *ia64*
o All files named *ia64*
o All ia64-specific code guarded by __ia64__
o All ia64-specific makefile logic
o Mention of ia64 in comments and documentation

This excludes:
o Everything under contrib/
o Everything under crypto/
o sys/xen/interface
o sys/sys/elf_common.h

Discussed at: BSDcan


# c1088304 23-Jun-2012 Konstantin Belousov <kib@FreeBSD.org>

Remove no longer needed forward declaration for struct sf_buf.

MFC after: 29 days


# 21c295ef 23-Jun-2012 Konstantin Belousov <kib@FreeBSD.org>

Stop updating the struct vdso_timehands from the event handler executed in
the scheduled task from tc_windup(). Do it directly from tc_windup in
interrupt context [1].

Establish the permanent mapping of the shared page into the kernel
address space, avoiding the potential need to sleep waiting for
allocation of sf buffer during vdso_timehands update. As a
consequence, shared_page_write_start() and shared_page_write_end()
functions are not needed anymore.

Guess and memorize the pointers to native host and compat32 sysentvec
during initialization, to avoid the need to get shared_page_alloc_sx
lock during the update.

In tc_fill_vdso_timehands(), do not loop waiting for timehands
generation to stabilize, since vdso_timehands is written in the same
interrupt context which wrote timehands.

Requested by: mav [1]
MFC after: 29 days


# aea81038 22-Jun-2012 Konstantin Belousov <kib@FreeBSD.org>

Implement mechanism to export some kernel timekeeping data to
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.

The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.

The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.

Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.

Minimal stubs necessary for non-x86 architectures to still compile
are provided.

Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month


# a9d8437c 22-Jun-2012 Konstantin Belousov <kib@FreeBSD.org>

Enhance the shared page chunk allocator.

Do not rely on the busy state of the page from which we allocate the
chunk, to protect allocator state. Use statically allocated sx lock
instead.

Provide more flexible KPI. In particular, allow to allocate chunk
without providing initial data, and allow writes into existing
allocation. Allow to get an sf buf which temporarily maps the chunk, to
allow sequential updates to shared page content without unmapping in
between.

Reviewed by: jhb
Tested by: flo
MFC after: 1 month


# 126b36a2 14-Oct-2011 Konstantin Belousov <kib@FreeBSD.org>

Control the execution permission of the readable segments for
i386 binaries on the amd64 and ia64 with the sysctl, instead of
unconditionally enabling it.

Reviewed by: marcel


# 8451d0dd 16-Sep-2011 Kip Macy <kmacy@FreeBSD.org>

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal to kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


# e5d81ef1 08-Mar-2011 Dmitry Chagin <dchagin@FreeBSD.org>

Extend struct sysvec with new method sv_schedtail, which is used for an
explicit process at fork trampoline path instead of eventhandler(schedtail)
invocation for each child process.

Remove eventhandler(schedtail) code and change linux ABI to use newly added
sysvec method.

While here replace explicit comparing of module sysentvec structure with the
newly created process sysentvec to detect the linux ABI.

Discussed with: kib

MFC after: 2 Week


# 08e6d9fa 01-Mar-2011 Robert Watson <rwatson@FreeBSD.org>

Continue to introduce Capsicum Capability Mode support:

Add a new system call flag, SYF_CAPENABLED, which indicates that a
particular system call is available in capability mode.

Add a new configuration file, kern/capabilities.conf (similar files
may be introduced for other ABIs in the future), which enumerates
system calls that are available in capability mode. When a new
system call is added to syscalls.master, it will also need to be
added here (if needed). Teach sysent parts to use this file to set
values for SYF_CAPENABLED for the native ABI.

Reviewed by: anderson
Discussed with: benl, kris, pjd
Obtained from: Capsicum Project
MFC after: 3 months


# a5c1afad 26-Jan-2011 Dmitry Chagin <dchagin@FreeBSD.org>

Add macro to test the sv_flags of any process. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures.
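
A minimal sketch of the difference between the two styles of ABI test, under
stated assumptions: the structure layouts, the flag values, and the names
PROC_FLAG_SKETCH, PROC_ABI_SKETCH and proc_is_linux() are illustrative and are
not taken from the header.

```c
#include <stdbool.h>

/* Assumed flag layout: low byte names the ABI, higher bits are features. */
#define SV_ABI_MASK_SKETCH   0xffu
#define SV_ABI_LINUX_SKETCH  1u
#define SV_ILP32_SKETCH      0x100u

/* Stripped-down stand-ins for struct sysentvec and struct proc. */
struct sysentvec_sketch {
	unsigned int sv_flags;
};

struct proc_sketch {
	const struct sysentvec_sketch *p_sysent;
};

/* Test a feature flag of whatever ABI the process runs under. */
#define PROC_FLAG_SKETCH(p, x)  ((p)->p_sysent->sv_flags & (x))
/* Extract the ABI identifier of the process. */
#define PROC_ABI_SKETCH(p)      ((p)->p_sysent->sv_flags & SV_ABI_MASK_SKETCH)

/* No comparison against the address of a particular sysentvec is needed. */
static bool
proc_is_linux(const struct proc_sketch *p)
{
	return (PROC_ABI_SKETCH(p) == SV_ABI_LINUX_SKETCH);
}
```

Testing flags means several sysentvec instances of the same ABI (for example
32-bit and 64-bit variants) are all recognized without listing each address.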

MFC after: 1 month


# 6297a3d8 08-Jan-2011 Konstantin Belousov <kib@FreeBSD.org>

Create shared (readonly) page. Each ABI may specify the use of page by
setting SV_SHP flag and providing pointer to the vm object and mapping
address. Provide simple allocator to carve space in the page, tailored
to put the code with alignment restrictions.

Enable shared page use for amd64, both native and 32bit FreeBSD
binaries. Page is private mapped at the top of the user address
space, moving a start of the stack one page down. Move signal
trampoline code from the top of the stack to the shared page.

Reviewed by: alc


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# c6f5742f 22-Aug-2010 Rui Paulo <rpaulo@FreeBSD.org>

Kernel DTrace support for:
o uregs (sson@)
o ustack (sson@)
o /dev/dtrace/helper device (needed for USDT probes)

The work done by me was:
Sponsored by: The FreeBSD Foundation


# 5ed35ba9 29-Jun-2010 Konstantin Belousov <kib@FreeBSD.org>

Revert r209578:
Use C99 initializers for the struct sysent generated by MAKE_SYSENT().
C++ does not have designator-initializer facility of C99, not using this
in the header makes us friendly to C++ kernel modules, whoever wants
such schism.

Requested by: mdf
MFC after: 6 days (not really)


# 153ac44c 28-Jun-2010 Konstantin Belousov <kib@FreeBSD.org>

Count number of threads that enter and leave dynamically registered
syscalls. On the dynamic syscall deregistration, wait until all
threads leave the syscall code. This somewhat increases the safety
of the loadable modules unloading.
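
The counting scheme can be sketched with C11 atomics as below. This is a
userspace illustration under assumptions, not the kernel code: the structure,
the bit layout and the dyn_* function names are hypothetical, and a real
deregistration path would sleep or yield while waiting instead of polling.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define THR_DRAINING 0x1u   /* assumed: low bit marks "being deregistered" */
#define THR_INCR     0x2u   /* assumed: the count lives above the flag bit */

struct dyn_syscall {
	_Atomic uint32_t thrcnt;
};

/* Called on syscall entry; refuses new callers once draining has begun. */
static bool
dyn_enter(struct dyn_syscall *sc)
{
	uint32_t old = atomic_load(&sc->thrcnt);

	do {
		if ((old & THR_DRAINING) != 0)
			return (false);
	} while (!atomic_compare_exchange_weak(&sc->thrcnt, &old,
	    old + THR_INCR));
	return (true);
}

/* Called when the thread leaves the syscall code. */
static void
dyn_exit(struct dyn_syscall *sc)
{
	atomic_fetch_sub(&sc->thrcnt, THR_INCR);
}

/* Deregistration: stop new entries, then wait until dyn_drained() is true. */
static void
dyn_begin_drain(struct dyn_syscall *sc)
{
	atomic_fetch_or(&sc->thrcnt, THR_DRAINING);
}

static bool
dyn_drained(struct dyn_syscall *sc)
{
	return (atomic_load(&sc->thrcnt) == THR_DRAINING);
}
```

Keeping the flag and the count in one word lets entry check "not draining" and
bump the count in a single compare-and-swap, so no thread can slip in between
the check and the increment.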

Reviewed by: jhb
Tested by: pho
MFC after: 1 month


# dc2db34f 28-Jun-2010 Konstantin Belousov <kib@FreeBSD.org>

Use C99 initializers for the struct sysent generated by MAKE_SYSENT().

MFC after: 1 week


# b2318c28 26-May-2010 Konstantin Belousov <kib@FreeBSD.org>

Allow to use syscallname(9) outside subr_trap.c.

MFC after: 1 month


# afe1a688 23-May-2010 Konstantin Belousov <kib@FreeBSD.org>

Reorganize syscall entry and leave handling.

Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
usermode into struct syscall_args. The structure is machine-dependent
(this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
from the syscall. It is a generalization of
cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and sets up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively. The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by: jhb, marcel, marius, nwhitehorn, stas
Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
stas (mips)
MFC after: 1 month


# 0272ddd8 07-Apr-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r205321:
Introduce SYSCALL_INIT_HELPER and SYSCALL32_INIT_HELPER macros and
necessary support functions to allow registering dynamically loaded
syscalls from the MOD_LOAD handlers. Helpers handle registration
failures semi-automatically.


# ba4a8a92 07-Apr-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r205320:
For SYSCALL_MODULE_HELPER, use "sys/<syscallname>" module name.
For SYSCALL32_MODULE_HELPER, use "sys32/<syscallname>" module name.
This avoids modules name conflict when compat32 syscall does not
need shims.


# a107d8aa 25-Mar-2010 Nathan Whitehorn <nwhitehorn@FreeBSD.org>

Change the arguments of exec_setregs() so that it receives a pointer
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.

Reviewed by: jhb


# 0687ba3e 19-Mar-2010 Konstantin Belousov <kib@FreeBSD.org>

Introduce SYSCALL_INIT_HELPER and SYSCALL32_INIT_HELPER macros and
necessary support functions to allow registering dynamically loaded
syscalls from the MOD_LOAD handlers. Helpers handle registration
failures semi-automatically.

Reviewed by: jhb
MFC after: 2 weeks


# 99b331a9 19-Mar-2010 Konstantin Belousov <kib@FreeBSD.org>

For SYSCALL_MODULE_HELPER, use "sys/<syscallname>" module name.
For SYSCALL32_MODULE_HELPER, use "sys32/<syscallname>" module name.
This avoids modules name conflict when compat32 syscall does not
need shims.

Note that SYSCALL_MODULE_HELPER is going to be unused in the tree by
several next commits.

Suggested by: jhb
MFC after: 2 weeks


# e7228204 01-Mar-2010 Alfred Perlstein <alfred@FreeBSD.org>

Merge projects/enhanced_coredumps (r204346) into HEAD:

Enhanced process coredump routines.

This brings in the following features:
1) Limit number of cores per process via the %I coredump formatter.
Example:
if corefilename is set to %N.%I.core AND num_cores = 3, then
if a process "rpd" cores, then the corefile will be named
"rpd.0.core", however if it cores again, then the kernel will
generate "rpd.1.core" until we hit the limit of "num_cores".

this is useful to get several corefiles, but also prevent filling
the machine with corefiles.

2) Encode machine hostname in core dump name via %H.

3) Compress coredumps, useful for embedded platforms with limited space.
A sysctl kern.compress_user_cores is made available if turned on.

To enable compressed coredumps, the following config options need to be set:
options COMPRESS_USER_CORES
device zlib # brings in the zlib requirements.
device gzio # brings in the kernel vnode gzip output module.

4) Eventhandlers are fired to indicate coredumps in progress.

5) The imgact sv_coredump routine has grown a flag to pass in more
state, currently this is used only for passing a flag down to compress
the coredump or not.

Note that the gzio facility can be used for generic output of gzip'd
streams via vnodes.
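
The %N, %I and %H expansion from items 1 and 2 can be sketched as a small
userspace function. This is a sketch under assumptions: expand_corename() and
its signature are hypothetical, and the selection of the index (cycling up to
num_cores) is left to the caller rather than modeled here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Expand a corefile name template into out. Supported formatters in this
 * sketch: %N (process name), %H (hostname), %I (core index).
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int
expand_corename(const char *fmt, const char *name, const char *host,
    int index, char *out, size_t outlen)
{
	size_t used = 0;

	if (outlen == 0)
		return (-1);
	out[0] = '\0';
	for (; *fmt != '\0'; fmt++) {
		size_t left = outlen - used;
		int n;

		if (fmt[0] == '%' && fmt[1] == 'N') {
			n = snprintf(out + used, left, "%s", name);
			fmt++;		/* skip the formatter letter */
		} else if (fmt[0] == '%' && fmt[1] == 'H') {
			n = snprintf(out + used, left, "%s", host);
			fmt++;
		} else if (fmt[0] == '%' && fmt[1] == 'I') {
			n = snprintf(out + used, left, "%d", index);
			fmt++;
		} else {
			n = snprintf(out + used, left, "%c", *fmt);
		}
		if (n < 0 || (size_t)n >= left)
			return (-1);	/* truncated or failed */
		used += (size_t)n;
	}
	return (0);
}
```

With the template "%N.%I.core" and a process named "rpd", index 0 yields
"rpd.0.core" and index 1 yields "rpd.1.core", matching the example in item 1.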

Obtained from: Juniper Networks
Reviewed by: kan


# 46c10f27 01-Jun-2009 Robert Watson <rwatson@FreeBSD.org>

Add 'sy_flags', a currently unused per-syscall entry flags field that will
see future use in 9-CURRENT and 8-STABLE for features such as the
capability-mode enable flag and pay-as-you-audit.

Discussed with: jhb, sson


# b4cf0e62 21-Nov-2008 Konstantin Belousov <kib@FreeBSD.org>

Add sv_flags field to struct sysentvec with intention to provide description
of the ABI of the currently executing image. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures to determine ABI features.

Discussed with: dchagin, imp, jhb, peter


# f5a97d1b 05-Nov-2008 Craig Rodrigues <rodrigc@FreeBSD.org>

Merge latest DTrace changes from Perforce.


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 48a43ae8 25-Sep-2008 John Baldwin <jhb@FreeBSD.org>

Tidy up a few things with syscall generation:
- Instead of using a syscall slot (370) just to get a function prototype
for lkmressys(), add an explicit function prototype to <sys/sysent.h>.
This also removes unused special case checks for 'lkmressys' from
makesyscalls.sh.
- Instead of having magic logic in makesyscalls.sh to only generate a
function prototype the first time 'lkmnosys' is seen, make 'NODEF'
always not generate a function prototype and include an explicit
prototype for 'lkmnosys' in <sys/sysent.h>.
- As a result of the fix in (2), update the LKM syscall entries in
the freebsd32 syscall table to use 'lkmnosys' rather than 'nosys'.
- Use NOPROTO for the __syscall() entry (198) in the native ABI. This
avoids the need for magic logic in makesyscalls.sh to only generate
a function prototype the first time 'nosys' is encountered.


# 4e63215b 18-Sep-2008 John Baldwin <jhb@FreeBSD.org>

Whitespace fixes. This file also had 7 space indent in a few places.


# 59d8f3ff 12-Jul-2007 John Baldwin <jhb@FreeBSD.org>

Fix a couple of issues with the stack limit for 32-bit processes on 64-bit
kernels exposed by the recent fixes to resource limits for 32-bit processes
on 64-bit kernels:
- Let ABIs expose their maximum stack size via a new pointer in sysentvec
and use that in preference to maxssiz during exec() rather than always
using maxssiz for all processes.
- Apply the ABI's limit fixup to the previous stack size when adjusting
RLIMIT_STACK to determine if the existing mapping for the stack needs to
be grown or shrunk (as well as how much it should be grown or shrunk).

Approved by: re (kensmith)


# 19059a13 14-May-2007 John Baldwin <jhb@FreeBSD.org>

Rework the support for ABIs to override resource limits (used by 32-bit
processes under 64-bit kernels). Previously, each 32-bit process overwrote
its resource limits at exec() time. The problem with this approach is that
the new limits affect all child processes of the 32-bit process, including
if the child process forks and execs a 64-bit process. To fix this, don't
overwrite the resource limits during exec(). Instead, sv_fixlimits() is
now replaced with a different function sv_fixlimit() which asks the ABI to
sanitize a single resource limit. We then use this when querying and
setting resource limits. Thus, if a 32-bit process sets a limit, then
that new limit will be inherited by future children. However, if the
32-bit process doesn't change a limit, then a future 64-bit child will
see the "full" 64-bit limit rather than the 32-bit limit.
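
A minimal sketch of the per-limit approach, assuming hypothetical types, cap
values and function names (rlimit_sketch, abi32_fixlimit(), query_limit());
none of these are the kernel's definitions. The point shown is that the stored
limit is never modified, only the view handed to a 32-bit process.

```c
#include <stdint.h>

typedef uint64_t rlim_sketch_t;

struct rlimit_sketch {
	rlim_sketch_t rlim_cur;
	rlim_sketch_t rlim_max;
};

enum { RLIMIT_DATA_SKETCH = 2, RLIMIT_STACK_SKETCH = 3 };

/* Assumed caps for a 32-bit ABI; the values are purely illustrative. */
#define ABI32_MAXDSIZ_SKETCH ((rlim_sketch_t)1 << 30)	/* 1 GiB */
#define ABI32_MAXSSIZ_SKETCH ((rlim_sketch_t)64 << 20)	/* 64 MiB */

/* ABI hook: sanitize a single resource limit in place. */
static void
abi32_fixlimit(struct rlimit_sketch *rl, int which)
{
	rlim_sketch_t cap;

	switch (which) {
	case RLIMIT_DATA_SKETCH:
		cap = ABI32_MAXDSIZ_SKETCH;
		break;
	case RLIMIT_STACK_SKETCH:
		cap = ABI32_MAXSSIZ_SKETCH;
		break;
	default:
		return;		/* other limits pass through unchanged */
	}
	if (rl->rlim_cur > cap)
		rl->rlim_cur = cap;
	if (rl->rlim_max > cap)
		rl->rlim_max = cap;
}

/* Query path: clamp a copy, so the inherited value stays intact. */
static struct rlimit_sketch
query_limit(const struct rlimit_sketch *stored, int which, int is_32bit)
{
	struct rlimit_sketch view = *stored;

	if (is_32bit)
		abi32_fixlimit(&view, which);
	return (view);
}
```

Because only the copy is clamped, a 64-bit child that inherits the stored
limit from a 32-bit parent still sees the full value.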

MFC is tentative since it will break the ABI of old linux.ko modules (no
other modules are affected).

MFC after: 1 week


# ddda35b8 02-Apr-2007 John Baldwin <jhb@FreeBSD.org>

- Split out the part of SYSCALL_MODULE_HELPER() that builds a 'struct
sysent' for a new system call into a new MAKE_SYSENT() macro.
- Use MAKE_SYSENT() to build a full sysent for the nfssvc system call in
the NFS server and use syscall_register() and syscall_deregister() to
manage the nfssvc system call entry instead of manually frobbing the
sysent[] array.


# 22a09fe4 20-Dec-2006 Jung-uk Kim <jkim@FreeBSD.org>

MFP4: (part of) 109714

Add SYSCALL_MODULE_PRESENT() macro. The idea was borrowed from
syscall_register().


# 57d6c87c 15-Aug-2006 John Baldwin <jhb@FreeBSD.org>

Use SYS_AUE_<syscallname> to include the appropriate audit event identifier
for syscalls in kld's, even when compiled into the kernel statically.
Note that since this hardcodes the SYS_ prefix SYSCALL_MODULE_HELPER() now
only works for native ABI system calls. Those are the only ones that
used the macro anyway, and I chose to not require a second argument to the
macro to specify the prefix or audit event directly.


# d80c6996 02-Aug-2006 John Birrell <jb@FreeBSD.org>

Add fields to struct sysent to support the DTrace syscall provider called
systrace.

Another file called systrace_args.c is generated. This will be compiled
into systrace and is used to map the syscall arguments into the 64-bit
parameter array.


# 03e161fd 01-Aug-2006 John Baldwin <jhb@FreeBSD.org>

Make system call modules a bit more robust:
- If we fail to register the system call during MOD_LOAD, then note that
so that we don't try to deregister it or invoke the chained event handler
during the subsequent MOD_UNLOAD event. Doing the deregister when the
register failed could result in trashing system call entries.
- Add a SI_SUB_SYSCALLS just before starting up init and use that to
register syscall modules instead of SI_SUB_DRIVERS. Registering system
calls as late as possible increases the chances that any other module
event handlers or SYSINITs in a module are executed to initialize the
data in a kld before a syscall dependent on that data is able to be
invoked.

MFC after: 3 days


# cb76d9b0 28-Jul-2006 John Baldwin <jhb@FreeBSD.org>

Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.


# 1471f287 02-Nov-2005 Paul Saab <ps@FreeBSD.org>

Calling setrlimit from 32bit apps could potentially increase certain
limits beyond what a 32bit process should be capable of, so we
must fixup the limits.

Reviewed by: jhb


# 9104847f 13-Oct-2005 David Xu <davidxu@FreeBSD.org>

1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
sendsig used discrete parameters; now they use member fields of
ksiginfo_t structure. For sendsig, this change allows us to pass
POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
be fixed.

6. In this sigqueue implementation, the pending signal set is kept as before;
an extra siginfo list holds additional siginfo_t data for signals.
Kernel code that uses psignal() still behaves as before; it won't fail
even under memory pressure. The only exception is when deleting a signal:
we should call sigqueue_delete to remove the signal from sigqueue rather
than SIGDELSET. Currently no kernel code delivers a signal
with additional data, so the kernel should be as stable as before.
A ksiginfo can carry more information, for example, allow signal to
be delivered but throw away siginfo data if memory is not enough.
SIGKILL and SIGSTOP have a fast path in sigqueue_add, because they cannot
be caught or masked.
The sigqueue() syscall allows user code to queue a signal to a target
process; if resources are unavailable, EAGAIN will be returned as
the specification says.
Just before a thread exits, signal queue memory will be freed by
sigqueue_flush.
Currently, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64


# 8347dc9b 29-May-2005 Robert Watson <rwatson@FreeBSD.org>

Add a new field, sy_auevent, to the system call entry description
structure, sysent. This field will hold the default audit event
to generate when the system call is entered. Currently, it will
default to 0 due to allocation in bss.

Submitted by: wsalamon
Obtained from: TrustedBSD Project


# 82c6e879 06-Apr-2004 Warner Losh <imp@FreeBSD.org>

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# c460ac3a 24-Sep-2003 Peter Wemm <peter@FreeBSD.org>

Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit
systems where the data/stack/etc limits are too big for a 32 bit process.

Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

Supply an ia32_fixlimits function. Export the clip/default values to
sysctl under the compat.ia32 hierarchy.

Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable. This allows mmap to
place mappings at sensible locations when limits have been reduced.

Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.

Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'. And that causes exec etc to fail since it can no
longer find space to mmap things.


# d1e405c5 13-Dec-2002 Alfred Perlstein <alfred@FreeBSD.org>

SCARGS removal take II.


# bc9e75d7 13-Dec-2002 Alfred Perlstein <alfred@FreeBSD.org>

Backout removal SCARGS, the code freeze is only "selectively" over.


# 0bbe7292 13-Dec-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove SCARGS.

Reviewed by: md5


# f36ba452 01-Sep-2002 Jake Burkholder <jake@FreeBSD.org>

Added fields for VM_MIN_ADDRESS, PS_STRINGS and stack protections to
sysentvec. Initialized all fields of all sysentvecs, which will allow
them to be used instead of constants in more places. Provided stack
fixup routines for emulations that previously used the default.


# 3ebc1248 19-Jul-2002 Peter Wemm <peter@FreeBSD.org>

Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable
handler in the kernel at the same time. Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment. This is a big help for execing i386 binaries
on ia64. The ELF exec code grows the ability to map partial pages when
there is a page size difference, eg: emulating 4K pages on 8K or 16K
hardware pages.

Flesh out the i386 emulation support for ia64. At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.

Obtained from: dfr (mostly, many tweaks from me).


# c58eb46e 23-Mar-2002 Bruce Evans <bde@FreeBSD.org>

Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses. Switch to KNF
formatting and/or rewrap the whole prototype in some cases.


# 789f12fe 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P


# 558626dc 18-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

have the SYSCALL_MODULES macro provide an initializer for the 'old_sysent'
to avoid pedantic warnings.


# 21d56e9c 29-Dec-2001 Alfred Perlstein <alfred@FreeBSD.org>

Make AIO a loadable module.

Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO
will use at_exit(9).

Add functions at_exec(9), rm_at_exec(9) which function nearly the
same as at_exit(9) and rm_at_exit(9), these functions are called
on behalf of modules at the time of execve(2) after the image
activator has run.

Use a modified version of tegge's suggestion via at_exec(9) to close
an exploitable race in AIO.

Fix SYSCALL_MODULE_HELPER such that it's architecturally neutral,
the problem was that one had to pass it a parameter indicating the
number of arguments which were actually the number of "int". Fix
it by using an inline version of the AS macro against the syscall
arguments. (AS should be available globally but we'll get to that
later.)

Add a primitive system for dynamically adding kqueue ops, it's really
not as sophisticated as it should be, but I'll discuss with jlemon when
he's around.


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 78525ce3 01-Dec-2000 Alfred Perlstein <alfred@FreeBSD.org>

sysvipc loadable.

new syscall entry lkmressys - "reserved loadable syscall"

Make syscall_register allow overwriting of such entries (lkmressys).


# 806d7daa 09-Nov-2000 Marcel Moolenaar <marcel@FreeBSD.org>

Make MINSIGSTKSZ machine dependent, and have the sigaltstack
syscall compare against a variable sv_minsigstksz in struct
sysentvec as to properly take the size of the machine- and
ABI dependent struct sigframe into account.

The SVR4 and iBCS2 modules continue to have a minsigstksz of
8192 to preserve behavior. The real values (if different) are
not known at this time. Other ABI modules use the real
values.

The native MINSIGSTKSZ is now defined as follows:

Arch MINSIGSTKSZ
---- -----------
alpha 4096
i386 2048
ia64 12288
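
A rough sketch of the size check described above, assuming a stripped-down
stand-in for struct sysentvec. The structure abi_sketch and the function
check_altstack_size() are hypothetical; ENOMEM is the error POSIX specifies
for a too-small alternate stack, and the 2048 figure comes from the i386 row
of the table.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for the one struct sysentvec field this check needs. */
struct abi_sketch {
	size_t sv_minsigstksz;	/* minimum alternate signal stack size */
};

/*
 * Validate a requested alternate stack size against the minimum of the
 * ABI the process runs under, instead of a compile-time constant.
 */
static int
check_altstack_size(const struct abi_sketch *abi, size_t ss_size)
{
	if (ss_size < abi->sv_minsigstksz)
		return (ENOMEM);
	return (0);
}
```

Reading the minimum from the ABI descriptor lets an emulated ABI with a larger
signal frame reject stacks that the native ABI would accept.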

Reviewed by: mjacob
Suggested by: bde


# 00910f28 05-Nov-2000 David E. O'Brien <obrien@FreeBSD.org>

ELF kernels should use an ELF sysvec. This allows us to move a.out
specific files to those platforms that actually support a.out.


# d323ddf3 26-Apr-2000 Matthew Dillon <dillon@FreeBSD.org>

Fix #! script exec under linux emulation. If a script is exec'd from a
program running under linux emulation, the script binary is checked for
in /compat/linux first. Without this patch the wrong script binary
(i.e. the FreeBSD binary) will be run instead of the linux binary.
For example, #!/bin/sh, thus breaking out of linux compatibility mode.

This solves a number of problems people have had installing linux
software on FreeBSD boxes.


# 36e9f877 28-Mar-2000 Matthew Dillon <dillon@FreeBSD.org>

Commit major SMP cleanups and move the BGL (big giant lock) in the
syscall path inward. A system call may select whether it needs the MP
lock or not (the default being that it does need it).

A great deal of conditional SMP code for various deadended experiments
has been removed. 'cil' and 'cml' have been removed entirely, and the
locking around the cpl has been removed. The conditional
separately-locked fast-interrupt code has been removed, meaning that
interrupts must hold the CPL now (but they pretty much had to anyway).
Another reason for doing this is that the original separate-lock for
interrupts just doesn't apply to the interrupt thread mechanism being
contemplated.

Modifications to the cpl may now ONLY occur while holding the MP
lock. For example, if an otherwise MP safe syscall needs to mess with
the cpl, it must hold the MP lock for the duration and must (as usual)
save/restore the cpl in a nested fashion.

This is precursor work for the real meat coming later: avoiding having
to hold the MP lock for common syscalls and I/O's and interrupt threads.
It is expected that the spl mechanisms and new interrupt threading
mechanisms will be able to run in tandem, allowing a slow piecemeal
transition to occur.

This patch should result in a moderate performance improvement due to
the considerable amount of code that has been removed from the critical
path, especially the simplification of the spl*() calls. The real
performance gains will come later.

Approved by: jkh
Reviewed by: current, bde (exception.s)
Some work taken from: luoqi's patch


# 664a31e4 28-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistent with the other
BSD's who made this change quite some time ago. More commits to come.


# 8e95fb76 27-Dec-1999 Bruce Evans <bde@FreeBSD.org>

Fixed namespace pollution in rev.1.24 (don't implement <sys/signal.h> here).
Fixed long lines.


# e3bdd871 27-Dec-1999 Bruce Evans <bde@FreeBSD.org>

This should have been committed with related changes to .c files.

Changed the type used to represent the user stack pointer from `long *'
to `register_t *'. This fixes bugs like misplacement of argc and argv
on the user stack on i386's with 64-bit longs. We still use longs to
represent "words" like argc and argv, and assume that they are on the
stack (and that there is a stack). The suword() and fuword() families
should also use register_t.
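
The misplacement is easy to see in a user-space model. The sketch below
assumes 32-bit stack slots and a compiler with 64-bit longs; the
model_* names and the two typedefs are hypothetical stand-ins, not the
kernel's definitions. Indexing the stack with the wider type lands on
the wrong slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t model_register_t;	/* hypothetical: 32-bit machine word */
typedef int64_t model_long_t;		/* hypothetical: 64-bit long */

static_assert(sizeof(model_long_t) == 2 * sizeof(model_register_t),
    "the model assumes long is twice the register width");

/* Fill nslots register-sized slots with the values 100, 101, ... */
void
model_fill_stack(unsigned char *stack, size_t nslots)
{
	size_t i;

	for (i = 0; i < nslots; i++) {
		model_register_t v = (model_register_t)(100 + i);
		memcpy(stack + i * sizeof(v), &v, sizeof(v));
	}
}

/*
 * Read the register-sized value found at index n when the stack is
 * treated as an array whose elements are elem_size bytes wide.
 */
model_register_t
model_read_slot(const unsigned char *stack, size_t elem_size, size_t n)
{
	model_register_t value;

	memcpy(&value, stack + elem_size * n, sizeof(value));
	return (value);
}
```

With the stack filled as above, index 1 through the register-sized view
yields the second slot, while index 1 through the long-sized view skips
ahead to the third.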


# 2c42a146 29-Sep-1999 Marcel Moolenaar <marcel@FreeBSD.org>

sigset_t change (part 2 of 5)
-----------------------------

The core of the signalling code has been rewritten to operate
on the new sigset_t. No methodological changes have been made.
Most references to a sigset_t object are through macros (see
signalvar.h) to create a level of abstraction and to provide
a basis for further improvements.

The NSIG constant has not been changed to reflect the maximum
number of signals possible. The reason is that it breaks
programs (especially shells) which assume that all signals
have a non-null name in sys_signame. See src/bin/sh/trap.c
for an example. Instead _SIG_MAXSIG has been introduced to
hold the maximum signal possible with the new sigset_t.

struct sigprop has been moved from signalvar.h to kern_sig.c
because a) it is only used there, and b) access must be done
through the function sigprop(). The latter is because the table doesn't
hold properties for all signals, but only for the first NSIG
signals.

signal.h has been reorganized to make reading easier and to
add the new and/or modified structures. The "old" structures
are moved to signalvar.h to prevent namespace pollution.

Especially the coda filesystem suffers from the change, because
it contained lines like (p->p_sigmask == SIGIO), which is easy
to do for integral types, but not for compound types.
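
As a userland illustration only (the kernel change itself goes through
the macros in signalvar.h), the POSIX set operations show what replaces
an integer comparison once sigset_t is a compound type. The function
names are hypothetical, and SIGINT stands in for SIGIO to stay within
the signals POSIX guarantees.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>

/* Membership test that works whether sigset_t is integral or a struct. */
int
mask_has_signal(const sigset_t *mask, int sig)
{
	int r = sigismember(mask, sig);

	assert(r != -1);	/* -1 would mean sig is not a valid signal */
	return (r);
}

/* Build a mask holding only SIGINT and query it. */
int
mask_demo(void)
{
	sigset_t mask;

	sigemptyset(&mask);
	sigaddset(&mask, SIGINT);
	return (mask_has_signal(&mask, SIGINT) &&
	    !mask_has_signal(&mask, SIGTERM));
}
```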

NOTE: kdump (and port linux_kdump) must be recompiled.

Thanks to Garrett Wollman and Daniel Eischen for pressing the
importance of changing sigreturn as well.


# 5772c055 02-Sep-1999 Marcel Moolenaar <marcel@FreeBSD.org>

Silence warnings about the use of vnode sans declaration.


# fca666a1 31-Aug-1999 Julian Elischer <julian@FreeBSD.org>

General cleanup of core-dumping code.

Submitted by: Sean Fagan,


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 4c3df794 09-Jan-1999 Doug Rabson <dfr@FreeBSD.org>

Implement support for adding syscalls in KLD modules.

Submitted by: Assar Westerlund <assar@sics.se>


# 22d4b0fb 13-Sep-1998 John Polstra <jdp@FreeBSD.org>

Add provisions for variant core dump file formats, depending on the
object format of the executable being dumped. This is the first
step toward producing ELF core dumps in the proper format. I will
commit the code to generate the ELF core dumps Real Soon Now. In
the meantime, ELF executables won't dump core at all. That is
probably no less useful than dumping a.out-style core dumps as they
have done until now.

Submitted by: Alex <garbanzo@hooked.net> (with very minor changes by me)


# ecbb00a2 07-Jun-1998 Doug Rabson <dfr@FreeBSD.org>

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
in line with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# 288078be 28-Apr-1998 Eivind Eklund <eivind@FreeBSD.org>

Translate T_PROTFLT to SIGSEGV instead of SIGBUS when running under
Linux emulation. This makes Allegro Common Lisp 4.3 work under
FreeBSD!

Submitted by: Fred Gilham <gilham@csl.sri.com>
Commented on by: bde, dg, msmith, tg
Hoping he got everything right: eivind


# 9cf2c3e7 03-Feb-1998 Bruce Evans <bde@FreeBSD.org>

Forward declare some structs so that this file is more self-sufficient.


# cb226aaa 06-Nov-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warnings, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
to be recompiled.
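
A rough sketch of the calling-convention change, using hypothetical
model_* types rather than the real struct proc or any real syscall:
the old form reports its result through a third pointer argument, the
new form stores it in the process structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct proc and a syscall's args struct. */
struct model_proc {
	int p_pid;
	int p_retval[2];	/* return value slots now live here */
};

struct model_getpid_args {
	int dummy;
};

/* Old style: the result goes out through a third parameter. */
int
model_old_getpid(struct model_proc *p, struct model_getpid_args *uap,
    int *retval)
{
	(void)uap;
	assert(p != NULL && retval != NULL);
	*retval = p->p_pid;
	return (0);
}

/* New style: two parameters, the result is stored in the proc. */
int
model_new_getpid(struct model_proc *p, struct model_getpid_args *uap)
{
	(void)uap;
	assert(p != NULL);
	p->p_retval[0] = p->p_pid;
	return (0);
}
```

Moving the slot into the structure changes its layout, which is why
programs that read struct proc directly have to be rebuilt.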


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 996c772f 09-Feb-1997 John Dyson <dyson@FreeBSD.org>

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 484141f6 25-Oct-1996 Bruce Evans <bde@FreeBSD.org>

Declare pointers to signal handling functions in full instead of as
sig_t's so that <sys/signal.h> isn't a prerequisite.
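
A small sketch of the idea, with hypothetical model_* names: spelling
out the function pointer type gives a member of the same type as one
declared through a typedef, but the structure no longer depends on the
header that provides the typedef.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the typedef a signal header would provide. */
typedef void (*model_sig_t)(int);

/* Declared through the typedef: needs the typedef to be visible. */
struct model_vec_typedef {
	model_sig_t handler;
};

/* Declared in full: needs no other header. */
struct model_vec_full {
	void (*handler)(int);
};

void
model_noop_handler(int sig)
{
	(void)sig;
}

/* Both members accept the same function and compare equal. */
int
model_same_handler_type(void)
{
	struct model_vec_typedef a = { model_noop_handler };
	struct model_vec_full b = { model_noop_handler };

	assert(a.handler != NULL && b.handler != NULL);
	return (a.handler == b.handler);
}
```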


# 6ead3edd 17-Jun-1996 John Dyson <dyson@FreeBSD.org>

Clean-up the new VM map procfs code, and also add support for executable
format file "etype". It contains a description of the binary type for
a process.


# 04a49797 03-Apr-1996 Sujal Patel <smpatel@FreeBSD.org>

Fixed a typo in the comment for sv_errsize.


# 7a416e2d 29-Mar-1996 Bruce Evans <bde@FreeBSD.org>

Fixed the type of sv_sendsig. The `code' arg to signal handlers is now
u_long.


# d66a5066 02-Mar-1996 Peter Wemm <peter@FreeBSD.org>

Mega-commit for Linux emulator update. This has been stress tested under
netscape-2.0 for Linux running all the Java stuff. The scrollbars are now
working, at least on my machine. (whew! :-)

I'm uncomfortable with the size of this commit, but it's too
interdependent to easily separate out.

The main changes:

COMPAT_LINUX is *GONE*. Most of the code has been moved out of the i386
machine dependent section into the linux emulator itself. The int 0x80
syscall code was almost identical to the lcall 7,0 code and a minor tweak
allows them to both be used with the same C code. All kernels can now
just modload the lkm and it'll DTRT without having to rebuild the kernel
first. Like IBCS2, you can statically compile it in with "options LINUX".

A pile of new syscalls implemented, including getdents(), llseek(),
readv(), writev(), msync(), personality(). The Linux-ELF libraries want
to use some of these.

linux_select() now obeys Linux semantics, i.e. it returns the time remaining
of the timeout value rather than leaving it at the original value.
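
The "time remaining" arithmetic can be sketched as below. This is not
the emulator's code: model_time_remaining() is a hypothetical helper,
and clamping at zero when the elapsed time exceeds the timeout is an
assumption made for the illustration.

```c
#include <assert.h>
#include <sys/time.h>

/*
 * Return timeout minus elapsed, clamped at zero. Both inputs must be
 * normalized (0 <= tv_usec < 1000000).
 */
struct timeval
model_time_remaining(struct timeval timeout, struct timeval elapsed)
{
	struct timeval left;

	assert(timeout.tv_usec >= 0 && timeout.tv_usec < 1000000);
	assert(elapsed.tv_usec >= 0 && elapsed.tv_usec < 1000000);

	left.tv_sec = timeout.tv_sec - elapsed.tv_sec;
	left.tv_usec = timeout.tv_usec - elapsed.tv_usec;
	if (left.tv_usec < 0) {		/* borrow one second */
		left.tv_sec--;
		left.tv_usec += 1000000;
	}
	if (left.tv_sec < 0) {		/* waited longer than the timeout */
		left.tv_sec = 0;
		left.tv_usec = 0;
	}
	return (left);
}
```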

Quite a few bugs removed, including incorrect arguments being used in
syscalls, e.g. mixups between passing the sigset as an int, vs passing
it as a pointer and doing a copyin(), missing return values, unhandled
cases, SIOC* ioctls, etc.

The build for the code has changed. i386/conf/files now knows how
to build linux_genassym and generate linux_assym.h on the fly.

Supporting changes elsewhere in the kernel:

The user-mode signal trampoline has moved from the U area to immediately
below the top of the stack (below PS_STRINGS). This allows the different
binary emulations to have their own signal trampoline code (which gets rid
of the hardwired syscall 103 (sigreturn on BSD, syslog on Linux)) and so
that the emulator can provide the exact "struct sigcontext *" argument to
the program's signal handlers.

The sigstack's "ss_flags" now uses SS_DISABLE and SS_ONSTACK flags, which
have the same values as the re-used SA_DISABLE and SA_ONSTACK which are
intended for sigaction only. This enables the support of a SA_RESETHAND
flag to sigaction to implement the gross SYSV and Linux SA_ONESHOT signal
semantics where the signal handler is reset when it's triggered.
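
From userland, the one-shot behaviour looks roughly like the sketch
below, which uses the standard sigaction() interface with SA_RESETHAND.
The function and variable names are hypothetical; it checks that after
one delivery the disposition is back to SIG_DFL.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_runs;

static void
count_signal(int sig)
{
	(void)sig;
	handler_runs++;
}

/*
 * Install a one-shot handler for SIGUSR1, deliver the signal once, and
 * return 1 if the handler ran exactly once and was then reset.
 */
int
oneshot_handler_is_reset(void)
{
	struct sigaction sa, cur;
	int rc;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = count_signal;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESETHAND;	/* reset to SIG_DFL on delivery */

	handler_runs = 0;
	rc = sigaction(SIGUSR1, &sa, NULL);
	assert(rc == 0);
	rc = raise(SIGUSR1);		/* handler runs before raise returns */
	assert(rc == 0);
	rc = sigaction(SIGUSR1, NULL, &cur);	/* query only */
	assert(rc == 0);
	(void)rc;

	return (handler_runs == 1 && cur.sa_handler == SIG_DFL);
}
```

A second SIGUSR1 after this point would take the default action, so a
program that wants repeated handling has to reinstall the handler.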

makesyscalls.sh no longer appends the struct sysentvec on the end of the
generated init_sysent.c code. It's a lot saner to have it in a separate
file rather than trying to update the structure inside the awk script. :-)

At exec time, the dozen bytes or so of signal trampoline code are copied
to the top of the user's stack, rather than obtaining the trampoline code
the old way by getting a clone of the parent's user area. This allows
Linux and native binaries to freely exec each other without getting
trampolines mixed up.


# 512fef80 20-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Completed function declarations and/or added prototypes.


# 3cb43dbd 19-Sep-1995 Bruce Evans <bde@FreeBSD.org>

Generate prototypes for syscall-implementing functions. Put them in
<sys/sysproto.h> and use them (so far only) in kern/init_sysent.c.

Don't put $Id in generated files.

kern/syscalls.master:
I had to add some new fields to describe some non-orthogonal names.
E.g., the args struct for the syscall-implementing function foo()
is usually named `foo_args', but for getpid() it is named `args'.

sys/sysent.h:
sy_call_t is still incomplete to hide a couple of warnings.


# b5e8ce9f 16-Mar-1995 Bruce Evans <bde@FreeBSD.org>

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 1e1e0b44 14-Feb-1995 Søren Schmidt <sos@FreeBSD.org>

First attempt to run linux binaries. This is only the changes needed to
the generic kernel. The actual emulator is a separate LKM. (not finished
yet, sorry).
Submitted by: sos@freebsd.org & sef@kithrup.com


# 97f8109e 09-Oct-1994 Søren Schmidt <sos@FreeBSD.org>

Added errno conversion table for ibcs2 support.


# ca70a975 24-Aug-1994 Søren Schmidt <sos@FreeBSD.org>

New file declaring the sysent structures