History log of /freebsd-current/sys/sys/resourcevar.h
Revision Date Author Comments
# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 2ff63af9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .h pattern

Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/


# bbe62559 19-Aug-2022 Mateusz Guzik <mjg@FreeBSD.org>

rlimit: line up with other clean up in thread_reap_domain

NFC


# 8a0cb04d 01-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

Add lim_cowsync, similar to crcowsync


# dde5c031 29-May-2021 Mateusz Guzik <mjg@FreeBSD.org>

Fix up macro use in lim_cur


# fb8ab680 14-Nov-2020 Mateusz Guzik <mjg@FreeBSD.org>

thread: batch resource limit free calls


# 109b537c 25-Jul-2020 Mateusz Guzik <mjg@FreeBSD.org>

Remove leftover macros for long gone vmsize mtx


# 61a74c5c 15-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

schedlock 1/4

Eliminate recursion from most thread_lock consumers. Return from
sched_add() without the thread_lock held. This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks. This will eventually allow for lockless remote adds.

Discussed with: kib
Reviewed by: jhb
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22626


# 73e62bc9 10-Dec-2018 Mateusz Guzik <mjg@FreeBSD.org>

Make lim_cur inline if possible.

It is a function call only to accommodate *some* ABIs which install a hook.
They only care for 3 types of limits: DATA, STACK, VMEM

Instead of always calling the func, see at compilation time if the requested
limit is something else and just do the read if so.

Sponsored by: The FreeBSD Foundation
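
The compile-time selection described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: every `_sketch` name, the `LIM_*` identifiers, and the `slow_path_calls` counter are hypothetical, and `__builtin_constant_p` is a GCC/Clang extension rather than standard C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit identifiers standing in for the kernel's RLIMIT_* values. */
enum { LIM_CPU, LIM_DATA, LIM_STACK, LIM_VMEM, LIM_NOFILE, LIM_COUNT };

struct limits_sketch {
	int64_t cur[LIM_COUNT];
};

/* Counts slow-path calls so the effect of the macro is observable. */
static int slow_path_calls;

/* Out-of-line path: where an ABI-specific hook could adjust the value. */
int64_t
lim_cur_slow_sketch(const struct limits_sketch *l, int which)
{
	assert(which >= 0 && which < LIM_COUNT);
	slow_path_calls++;
	return (l->cur[which]);
}

/*
 * If `which` is a compile-time constant other than DATA, STACK or VMEM,
 * read the field directly; otherwise fall back to the function call.
 * `which` is evaluated more than once, so pass no side effects.
 */
#define lim_cur_sketch(l, which)					\
	((__builtin_constant_p(which) && (which) != LIM_DATA &&	\
	    (which) != LIM_STACK && (which) != LIM_VMEM) ?		\
	    (l)->cur[(which)] : lim_cur_slow_sketch((l), (which)))
```

With a constant such as `LIM_NOFILE` the macro reduces to a plain field read; with `LIM_DATA` it still goes through the function, so a hook would keep working.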


# e8bb589d 04-Oct-2018 Matt Macy <mmacy@FreeBSD.org>

eliminate locking surrounding ui_vmsize and swap reserve by using atomics

Change swap_reserve and swap_total to be in units of pages so that
swap reservations can be done using only atomics instead of using a single
global mutex for swap_reserve and a single mutex for all processes running
under the same uid for uid accounting.

Results in mmap speed up and a 70% increase in brk calls / second.

Reviewed by: alc@, markj@, kib@
Approved by: re (delphij@)
Differential Revision: https://reviews.freebsd.org/D16273
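
As an illustration of reserving against a page-count limit with atomics only, here is a small user-space sketch using C11 `<stdatomic.h>`. It uses a compare-and-swap loop, which is one possible technique and not necessarily what the commit does; the names and the capacity of 1024 pages are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint64_t reserved_pages;		/* pages currently reserved */
static const uint64_t total_pages = 1024;	/* hypothetical capacity */

/* Try to reserve npages without taking any lock. */
bool
reserve_pages_sketch(uint64_t npages)
{
	uint64_t old = atomic_load(&reserved_pages);

	do {
		if (old > total_pages || npages > total_pages - old)
			return (false);
		/* On failure `old` is refreshed and the check is repeated. */
	} while (!atomic_compare_exchange_weak(&reserved_pages, &old,
	    old + npages));
	return (true);
}

/* Give back a previous reservation. */
void
release_pages_sketch(uint64_t npages)
{
	uint64_t prev = atomic_fetch_sub(&reserved_pages, npages);

	assert(prev >= npages);
	(void)prev;
}
```

Keeping the counters in pages rather than bytes is what lets a single machine word hold the whole value, so one atomic operation covers the update.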


# 9c11d8d4 17-Apr-2018 Brooks Davis <brooks@FreeBSD.org>

Remove the unused fuwintr() and suiwintr() functions.

Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by: kib, andrew
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15102


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# fbbd9655 28-Feb-2017 Warner Losh <imp@FreeBSD.org>

Renumber copyright clause 4

Renumber clause 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistence on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96


# 1bdbd705 28-Feb-2016 Konstantin Belousov <kib@FreeBSD.org>

Implement process-shared locks support for libthr.so.3, without
breaking the ABI. Special value is stored in the lock pointer to
indicate shared lock, and offline page in the shared memory is
allocated to store the actual lock.

Reviewed by: vangyzen (previous version)
Discussed with: deischen, emaste, jhb, rwatson,
Martin Simmons <martin@lispworks.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation


# a63513d7 14-Nov-2015 Edward Tomasz Napierala <trasz@FreeBSD.org>

Doh, commit in a wrong directory. Fix r290857.

MFC after: 1 month
Sponsored by: The FreeBSD Foundation


# cd672ca6 16-Jul-2015 Mateusz Guzik <mjg@FreeBSD.org>

Get rid of lim_update_thread and cred_update_thread.

Their primary use was in thread_cow_update to free up old resources.
Freeing had to be done with proc lock held and _cow_ funcs already knew
how to free old structs.


# 94edbbb0 24-Jun-2015 Mateusz Guzik <mjg@FreeBSD.org>

rlimit: fix an old name in a comment: uihashtbl_mtx -> uihashtbl_lock


# f6f6d240 10-Jun-2015 Mateusz Guzik <mjg@FreeBSD.org>

Implement lockless resource limits.

Use the same scheme implemented to manage credentials.

Code needing to look at process's credentials (as opposed to thread's) is
provided with *_proc variants of relevant functions.

Places which possibly had to take the proc lock anyway still use the proc
pointer to access limits.


# 5c7bebf9 26-Nov-2014 Konstantin Belousov <kib@FreeBSD.org>

The process spin lock currently has the following distinct uses:

- Threads lifetime cycle, in particular, counting of the threads in
the process, and interlocking with process mutex and thread lock.
The main reason of this is that turnstile locks are after thread
locks, so you e.g. cannot unlock blockable mutex (think process
mutex) while owning thread lock.

- Virtual and profiling itimers, since the timers activation is done
from the clock interrupt context. Replace the p_slock by p_itimmtx
and PROC_ITIMLOCK().

- Profiling code (profil(2)), for similar reason. Replace the p_slock
by p_profmtx and PROC_PROFLOCK().

- Resource usage accounting. Need for the spinlock there is subtle,
my understanding is that spinlock blocks context switching for the
current thread, which prevents td_runtime and similar fields from
changing (updates are done at the mi_switch()). Replace the p_slock
by p_statmtx and PROC_STATLOCK().

The split is done mostly for code clarity, and should not affect
scalability.

Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# dff9862c 23-Nov-2014 Mateusz Guzik <mjg@FreeBSD.org>

ifdef RACCT ui_racct_foreach and struct uidinfo's ui_racct

Change racct_ create and destroy to macros evaluating to nothing without RACCT
so that their callers passing ui_racct don't have to be ifdefed.


# 9110db81 21-Oct-2013 Konstantin Belousov <kib@FreeBSD.org>

Add a resource limit for the total number of kqueues available to the
user. Kqueue now saves the ucred of the allocating thread, to
correctly decrement the counter on close.

Under some specific and not real-world use scenario for kqueue, it is
possible for the kqueues to consume memory proportional to the square
of the number of the filedescriptors available to the process. Limit
allows administrator to prevent the abuse.

This is kernel-mode side of the change, with the user-mode enabling
commit following.

Reported and tested by: pho
Discussed with: jmg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 8854fe39 22-Jan-2012 Mikolaj Golub <trociny@FreeBSD.org>

Change kern.proc.rlimit sysctl to:

- retrieve only one, specified limit for a process, not the whole
array, as it was previously (the sysctl has been added recently and
has not been backported to stable yet, so this change is ok);

- allow to set a resource limit for another process.

Submitted by: Andrey Zonov <andrey at zonov.org>
Discussed with: kib
Reviewed by: kib
MFC after: 2 weeks


# 2417d97e 18-Jul-2011 John Baldwin <jhb@FreeBSD.org>

- Export each thread's individual resource usage in struct kinfo_proc's
ki_rusage member when KERN_PROC_INC_THREAD is passed to one of the
process sysctls.
- Correctly account for the current thread's cputime in the thread when
doing the runtime fixup in calcru().
- Use TIDs as the key to lookup the previous thread to compute IO stat
deltas in IO mode in top when thread display is enabled.

Reviewed by: kib
Approved by: re (kib)


# 097055e2 29-Mar-2011 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add racct. It's an API to keep per-process, per-jail, and per-loginclass
resource accounting information, to be used by the new
resource limits code. It's connected to the build, but the code that
actually calls the new functions will come later.

Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 1a996ed1 18-Jul-2010 Edward Tomasz Napierala <trasz@FreeBSD.org>

Revert r210225 - turns out I was wrong; the "/*-" is not license-only
thing; it's also used to indicate that the comment should not be automatically
rewrapped.

Explained by: cperciva@


# 805cc58a 18-Jul-2010 Edward Tomasz Napierala <trasz@FreeBSD.org>

The "/*-" comment marker is supposed to denote copyrights. Remove non-copyright
occurrences from sys/sys/ and sys/kern/.


# eea4ac8b 18-Jul-2010 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove outdated comment and move part of it into more applicable place.


# f3e1e28b 26-May-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r208488:
Fix the double counting of the last process thread td_incruntime
on exit, that is done once in thread_exit() and the second time in
proc_reap(), by clearing td_incruntime.

Approved by: re (kensmith)


# 41fd9c63 24-May-2010 Konstantin Belousov <kib@FreeBSD.org>

Fix the double counting of the last process thread td_incruntime
on exit, that is done once in thread_exit() and the second time in
proc_reap(), by clearing td_incruntime.

Use the opportunity to revert to the pre-RUSAGE_THREAD exporting of ruxagg()
instead of ruxagg_locked() and use it from thread_exit().

Diagnosed and tested by: neel
MFC after: 3 days


# c193de56 11-May-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r207468:
Extract thread_lock()/ruxagg()/thread_unlock() fragment into utility
function ruxagg_tlock().
Convert the definition of kern_getrusage() to ANSI C.

MFC r207602:
Implement RUSAGE_THREAD. Add td_rux to keep extended runtime and ticks
information for thread to allow calcru1() (re)use.

Rename ruxagg()->ruxagg_locked(), ruxagg_tlock()->ruxagg() [1].
The ruxagg_locked() function no longer clears thread ticks nor
td_incruntime.

Not an MFC: the td_rux is added to the end of struct thread to keep
the KBI. Explicit bzero() of td_rux is added to new thread initialization
points.


# bed4c524 03-May-2010 Konstantin Belousov <kib@FreeBSD.org>

Implement RUSAGE_THREAD. Add td_rux to keep extended runtime and ticks
information for thread to allow calcru1() (re)use.

Rename ruxagg()->ruxagg_locked(), ruxagg_tlock()->ruxagg() [1].
The ruxagg_locked() function no longer clears thread ticks nor
td_incruntime.

Requested by: attilio [1]
Discussed with: attilio, bde
Reviewed by: bde
Based on submission by: Alexander Krizhanovsky <ak natsys-lab com>
MFC after: 1 week
X-MFC-Note: td_rux shall be moved to the end of struct thread


# 3364c323 23-Jun-2009 Konstantin Belousov <kib@FreeBSD.org>

Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge appropriate uid when allocation is performed by
kernel, e.g. md(4).

Several syscalls, among them is fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with: pho
Reviewed by: alc
Approved by: re (kensmith)


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# bc093719 20-Aug-2008 Ed Schouten <ed@FreeBSD.org>

Integrate the new MPSAFE TTY layer to the FreeBSD operating system.

The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

The old TTY layer has a driver model that is not abstract enough to
make it friendly to use. A good example is the output path, where the
device drivers directly access the output buffers. This means that an
in-kernel PPP implementation must always convert network buffers into
TTY buffers.

If a PPP implementation would be built on top of the new TTY layer
(still needs a hooks layer, though), it would allow the PPP
implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

With the old TTY layer, it isn't entirely safe to destroy TTY's from
the system. This implementation has a two-step destructing design,
where the driver first abandons the TTY. After all threads have left
the TTY, the TTY layer calls a routine in the driver, which can be
used to free resources (unit numbers, etc).

The pts(4) driver also implements this feature, which means
posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

One of the major improvements is the per-TTY mutex, which is expected
to improve scalability when compared to the old Giant locking.
Another change is the unbuffered copying to userspace, which is both
used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from: //depot/projects/mpsafetty/...
Approved by: philip (ex-mentor)
Discussed: on the lists, at BSDCan, at the DevSummit
Sponsored by: Snow B.V., the Netherlands
dcons(4) fixed by: kan


# 1b072fbc 16-Mar-2008 Pawel Jakub Dawidek <pjd@FreeBSD.org>

- Use wait-free method to manage ui_sbsize and ui_proccnt fields in the
uidinfo structure. This entirely removes contention observed on the
ui_mtxp mutex (as it is now gone).
- Convert the uihashtbl_mtx mutex to a rwlock, as most of the time we just
need to read-lock it.

Reviewed by: jhb, jeff, kris & others
Tested by: kris
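
A counter that is charged without a mutex can be sketched as an atomic add followed by a rollback when the limit is exceeded. The sketch below is a user-space approximation with C11 atomics; `uidinfo_sketch`, `chg_count_sketch`, and the convention that `max == 0` means "no limit" are assumptions made for the example, not details taken from the commit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct uidinfo_sketch {
	atomic_long proccnt;	/* per-uid counter, updated without a lock */
};

/*
 * Adjust the counter by diff. A positive diff is refused, and undone,
 * if it would push the counter above max.
 */
bool
chg_count_sketch(struct uidinfo_sketch *ui, long diff, long max)
{
	assert(ui != NULL);
	if (diff > 0 && max != 0) {
		long newval = atomic_fetch_add(&ui->proccnt, diff) + diff;

		if (newval > max) {
			/* Over the limit: back out our own contribution. */
			atomic_fetch_sub(&ui->proccnt, diff);
			return (false);
		}
	} else {
		atomic_fetch_add(&ui->proccnt, diff);
	}
	return (true);
}
```

The counter may briefly exceed the limit between the add and the rollback; the trade-off is that no thread ever waits for another.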


# a1fe14bc 09-Jun-2007 Attilio Rao <attilio@FreeBSD.org>

rufetch and calcru sometimes should be called atomically together.
This patch fixes places where they should be called atomically changing
their locking requirements (both assume per-proc spinlock held) and
introducing rufetchcalc which wraps both calls so they are performed
atomically.

Reviewed by: jeff
Approved by: jeff (mentor)


# 1b1618fb 04-Jun-2007 Jeff Roberson <jeff@FreeBSD.org>

- Change comments and asserts to reflect the removal of the global
scheduler lock.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


# 1c4bcd05 31-May-2007 Jeff Roberson <jeff@FreeBSD.org>

- Move rusage from being per-process in struct pstats to per-thread in
td_ru. This removes the requirement for per-process synchronization in
statclock() and mi_switch(). This was previously supported by
sched_lock which is going away. All modifications to rusage are now
done in the context of the owning thread. Reads proceed without locks.
- Aggregate exiting threads rusage in thread_exit() such that the exiting
thread's rusage is not lost.
- Provide a new routine, rufetch() to fetch an aggregate of all rusage
structures from all threads in a process. This routine must be used
in any place requiring a rusage from a process prior to its exit. The
exited process's rusage is still available via p_ru.
- Aggregate tick statistics only on demand via rufetch() or when a thread
exits. Tick statistics are kept in the thread and protected by sched_lock
until it exits.

Initial patch by: attilio
Reviewed by: attilio, bde (some objections), arch (mostly silent)


# cb49fcd1 16-Dec-2005 John Baldwin <jhb@FreeBSD.org>

Change the addupc_*() functions to use the uintfptr_t type for pc rather
than uintptr_t as that is technically more correct.


# b2149bde 27-Sep-2005 John Baldwin <jhb@FreeBSD.org>

Use the reference count API to manage the reference counts for process
limit structures rather than using pool mutexes to protect the reference
counts.

Tested on: i386, alpha, sparc64
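
The idea of managing a limit structure's lifetime with an atomic reference count, instead of a mutex around a plain counter, can be sketched in user space as follows. This does not use the kernel's reference count API; it approximates the same pattern with C11 atomics, and all `_sketch` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct plimit_sketch {
	atomic_uint refcnt;	/* number of holders */
	long cur[4];		/* placeholder limit values */
};

/* Allocate a zeroed structure holding one reference. */
struct plimit_sketch *
lim_alloc_sketch(void)
{
	struct plimit_sketch *l = calloc(1, sizeof(*l));

	if (l != NULL)
		atomic_init(&l->refcnt, 1);
	return (l);
}

/* Take an additional reference. */
struct plimit_sketch *
lim_hold_sketch(struct plimit_sketch *l)
{
	atomic_fetch_add(&l->refcnt, 1);
	return (l);
}

/* Drop a reference; returns true if this call freed the structure. */
bool
lim_free_sketch(struct plimit_sketch *l)
{
	unsigned int old = atomic_fetch_sub(&l->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		free(l);
		return (true);
	}
	return (false);
}
```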


# 60727d8b 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for license, minor formatting changes


# 78c85e8d 05-Oct-2004 John Baldwin <jhb@FreeBSD.org>

Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
times it needs rather than calling getrusage() twice with associated
stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
for user, system, and interrupt time as well as a bintime of the total
runtime. A new p_rux field in struct proc replaces the same inline fields
from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux
field in struct proc contains the "raw" child time usage statistics.
ruadd() has been changed to handle adding the associated rusage_ext
structures as well as the values in rusage. Effectively, the values in
rusage_ext replace the ru_utime and ru_stime values in struct rusage. These
two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
calculates appropriate timevals for user and system time as well as updating
the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a
copy of the process' p_rux structure to compute the timevals after updating
the runtime appropriately if any of the threads in that process are
currently executing. It also now only locks sched_lock internally while
doing the rux_runtime fixup. calcru() now only requires the caller to
hold the proc lock and calcru1() only requires the proc lock internally.
calcru() also no longer allows callers to ask for an interrupt timeval
since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
calling calcru1() on p_crux. Note that this means that any code that wants
child times must now call this function rather than reading from p_cru
directly. This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
proctree lock is used to protect the process group rather than the process
group lock. By holding this lock until the end of the function we now
ensure that the process/thread that we pick to dump info about will no
longer vanish while we are trying to output its info to the console.

Submitted by: bde (mostly)
MFC after: 1 month


# e3a64610 04-Aug-2004 Robert Watson <rwatson@FreeBSD.org>

Annotate locking strategy for 'struct uidinfo'.


# 86db59f8 17-Jul-2004 Alfred Perlstein <alfred@FreeBSD.org>

Change named parameters from max (which conflicts with a macro in libkern.h)
to maxval.


# 52eb8464 16-Jul-2004 John Baldwin <jhb@FreeBSD.org>

- Move TDF_OWEPREEMPT, TDF_OWEUPC, and TDF_USTATCLOCK over to td_pflags
since they are only accessed by curthread and thus do not need any
locking.
- Move pr_addr and pr_ticks out of struct uprof (which is per-process)
and directly into struct thread as td_profil_addr and td_profil_ticks
as these variables are really per-thread. (They are used to defer an
addupc_intr() that was too "hard" until ast()).


# a3a70178 01-Jul-2004 John Baldwin <jhb@FreeBSD.org>

Tidy up uprof locking. Mostly the fields are protected by both the proc
lock and sched_lock so they can be read with either lock held. Document
the locking as well. The one remaining bogosity is that pr_addr and
pr_ticks should be per-thread but profiling of multithreaded apps is
currently undefined.


# 82c6e879 06-Apr-2004 Warner Losh <imp@FreeBSD.org>

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# a875f385 06-Feb-2004 John Baldwin <jhb@FreeBSD.org>

- Convert the plimit lock to a pool mutex lock.
- Hide struct plimit from userland.

Submitted by: bde (2)


# 99b6e02b 06-Feb-2004 John Baldwin <jhb@FreeBSD.org>

A few more style fixes from Bruce including a few I missed last time.

Submitted by: bde


# b4323d77 05-Feb-2004 John Baldwin <jhb@FreeBSD.org>

- A lot of style and whitespace fixes.
- Update a few comments regarding locking notes.

Submitted by: bde (1, mostly)


# 91d5354a 04-Feb-2004 John Baldwin <jhb@FreeBSD.org>

Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count. The plimit
structure is treated similarly to struct ucred in that it is always copy
on write, so having a reference to a structure is sufficient to read from
it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
limits from a process to keep the limit structure from changing out from
under you while reading from it.
- Various global limits that are ints are not protected by a lock since
int writes are atomic on all the archs we support and thus a lock
wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
either an rlimit, or the current or max individual limit of the specified
resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
(it didn't use the stackgap when it should have) but uses lim_rlimit()
and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits. It
also no longer uses the stackgap for accessing sysctl's for the
ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result,
ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by: mtm (mostly, I only did a few cleanups and catchups)
Tested on: i386
Compiled on: alpha, amd64
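
The copy-on-write treatment described in the first bullet of this entry can be sketched as below. It is a simplified user-space illustration: the structure and function names are hypothetical, and the reference count is a plain integer on the assumption that the caller already holds whatever lock protects it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct cow_limits_sketch {
	unsigned int refcnt;	/* protected by the caller's lock here */
	long cur[4];		/* placeholder limit values */
};

/*
 * Return a structure the caller may modify. A sole owner modifies in
 * place; otherwise the contents are cloned and the caller's reference
 * on the shared original is dropped.
 */
struct cow_limits_sketch *
lim_writable_sketch(struct cow_limits_sketch *old)
{
	struct cow_limits_sketch *copy;

	assert(old->refcnt >= 1);
	if (old->refcnt == 1)
		return (old);
	copy = malloc(sizeof(*copy));
	if (copy == NULL)
		return (NULL);
	memcpy(copy, old, sizeof(*copy));
	copy->refcnt = 1;
	old->refcnt--;
	return (copy);
}
```

Because a shared structure is never modified in place, any holder of a reference can read it without taking a further lock, which is the property the entry relies on.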


# 9665592e 28-Oct-2003 John Baldwin <jhb@FreeBSD.org>

According to the submitter, POSIX mandates that all interval timers are
reset in a child process after a fork(). Currently, however, only the
real timer is cleared while the virtual and profiling timers are inherited.

The realtimer is cleared because it lives directly in struct proc in
p_realtimer. It is in the zero'd section of struct proc. The other timers
live in the p_timer[] array in struct pstats. These timers are copied on
fork() rather than zero'd. The fix is to move p_timer[] to the zero'd
part of struct pstats so that they are zero'd instead of copied on fork().

Note: Since at least FreeBSD 2.0 (and possibly earlier) we've had storage
for two real interval timers. Now that the uarea is less important,
perhaps we could move all of p_timer[] over to struct proc and drop the
p_realtimer special case to fix that.

PR: kern/58647
Reported by: Dan Nelson <dnelson@allantgroup.com>
MFC after: 1 week
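
The "zero'd section" versus "copied section" layout this entry relies on can be sketched with `offsetof`. The structure below is invented for illustration (it is not the real struct pstats); only the technique of clearing one contiguous range and copying another on fork is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct pstats_sketch {
	long untouched;		/* not handled by fork_stats_sketch() */
	/* Fields from here are zeroed in the child. */
	long timer_virtual;
	long timer_prof;
	/* Fields from here are copied from the parent. */
	long prof_scale;
	long start_flags;
};

#define ZERO_BEGIN	offsetof(struct pstats_sketch, timer_virtual)
#define COPY_BEGIN	offsetof(struct pstats_sketch, prof_scale)

/* Initialize the child's statistics the way a fork would. */
void
fork_stats_sketch(const struct pstats_sketch *parent,
    struct pstats_sketch *child)
{
	assert(parent != child);
	memset((char *)child + ZERO_BEGIN, 0, COPY_BEGIN - ZERO_BEGIN);
	memcpy((char *)child + COPY_BEGIN, (const char *)parent + COPY_BEGIN,
	    sizeof(*child) - COPY_BEGIN);
}
```

Moving a field from the copied range to the zeroed range, as the fix does for the timers, changes its fork behavior without touching the fork code itself.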


# e02fef7a 20-Apr-2003 Robert Watson <rwatson@FreeBSD.org>

Use u_int for the struct uidinfo reference count rather than u_short;
while >65534 references is unlikely, it is possible.

Reviewed by: tjr
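
To make the overflow concern concrete, the following sketch contrasts a 16-bit and a 32-bit counter. The function names are hypothetical; `uint16_t` and `uint32_t` stand in for `u_short` and `u_int` on the assumption that those types are 16 and 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Take n references on a 16-bit counter, as a u_short field would. */
uint16_t
take_refs16_sketch(uint16_t refs, uint32_t n)
{
	while (n-- > 0)
		refs++;		/* wraps modulo 65536 */
	return (refs);
}

/* The same operation on a 32-bit counter. */
uint32_t
take_refs32_sketch(uint32_t refs, uint32_t n)
{
	assert(n <= UINT32_MAX - refs);
	return (refs + n);
}
```

Once a 16-bit count wraps to zero, a later release would treat a still-referenced structure as unused, which is why the wider type is the safer choice.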


# 27e39ae4 19-Feb-2003 Tim J. Robbins <tjr@FreeBSD.org>

Remove the PL_SHAREMOD flag from struct plimit, which could have been
used to share resource limits between rfork threads, but never was.
Removing it makes resource limit locking much simpler -- only the current
process can change the contents of the structure that p_limit points to.


# 4a338afd 17-Feb-2003 Julian Elischer <julian@FreeBSD.org>

Move a bunch of flags from the KSE to the thread.
I was in two minds as to where to put them in the first case..
I should have listened to the other mind.

Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@


# 6f8132a8 31-Jan-2003 Julian Elischer <julian@FreeBSD.org>

Reversion of commit by Davidxu plus fixes since applied.

I'm not convinced there is anything major wrong with the patch but
them's the rules..

I am using my "David's mentor" hat to revert this as he's
offline for a while.


# 0dbb100b 26-Jan-2003 David Xu <davidxu@FreeBSD.org>

Move UPCALL related data structure out of kse, introduce a new
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.

A thread that owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and take those contexts back
to userland. Any thread without an upcall structure has to export its
context and exit at the user boundary.

Any thread running in user mode owns an upcall structure. When it enters
the kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in the kernel, a new UPCALL thread is created and
the upcall structure is transferred to the new UPCALL thread. If the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in the kernel, no UPCALL thread will be created.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit; when all upcalls in a ksegrp are removed, the group is
automatically shut down. An upcall owner thread also exits when the process
is in exiting state. When an owner thread exits, the upcall it owns is also
removed.

KSE is a pure scheduler entity. It represents a virtual CPU. When a thread
is running, it always has a KSE associated with it. The scheduler is free to
assign a KSE to a thread according to thread priority; if thread priority is
changed, the KSE can be moved from one thread to another.

When a ksegrp is created, there are always N KSEs created in the group, where
N is the number of physical CPUs in the current system. This makes it
possible that, even if a userland UTS is only single-CPU safe, threads in the
kernel can still execute on different CPUs in parallel. Userland calls
kse_create to add more upcall structures into the ksegrp to increase
concurrency in userland itself; the kernel is not restricted by the number of
upcalls userland provides.

The code hasn't been tested under SMP by author due to lack of hardware.

Reviewed by: julian


# 5715307f 09-Oct-2002 John Baldwin <jhb@FreeBSD.org>

- Move p_cpulimit to struct proc from struct plimit and protect it with
sched_lock. This means that we no longer access p_limit in mi_switch()
and the p_limit pointer can be protected by the proc lock.
- Remove PRS_ZOMBIE check from CPU limit test in mi_switch(). PRS_ZOMBIE
processes don't call mi_switch(), and even if they did there is no longer
the danger of p_limit being NULL (which is what the original zombie check
was added for).
- When we bump the current process's soft CPU limit in ast(), just bump the
private p_cpulimit instead of the shared rlimit. This fixes an XXX for
some value of fix. There is still a (probably benign) bug in that this
code doesn't check that the new soft limit exceeds the hard limit.

Inspired by: bde (2)


# f4cd8f9f 30-Sep-2002 John Baldwin <jhb@FreeBSD.org>

Change p_cpulimit to be in seconds instead of microseconds. Since
p_runtime now is a bintime, it is no longer an optimization to store
p_cpulimit as microseconds.

Suggested by: phk


# f824b518 23-Jul-2002 John Polstra <jdp@FreeBSD.org>

Widen struct sockbuf's sb_timeo member to int from short. With
non-default but reasonable values of hz this member overflowed,
breaking NFS over UDP.

Also, as long as I'm plowing up struct sockbuf ... Change certain
members from u_long/long to u_int/int in order to reduce wasted
space on 64-bit machines. This change was requested by Andrew
Gallatin.

Netstat and systat need to be rebuilt. I am incrementing
__FreeBSD_version in case any ports need to change.


# 789f12fe 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P


# 012b7109 26-Feb-2002 Alfred Perlstein <alfred@FreeBSD.org>

remove trailing semi-colons from macro definitions.


# 547ce823 20-Jan-2002 Alfred Perlstein <alfred@FreeBSD.org>

use mutex pool mutexes for uidinfo locking.
replace mutex_lock calls on uidinfo with macro calls:
mtx_lock(&uidp->ui_mtx) -> UIDINFO_LOCK(uidp)

Terry Lambert <tlambert2@mindspring.com> helped with this.


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 688ebe12 10-Aug-2001 John Baldwin <jhb@FreeBSD.org>

- Close races with signals and other AST's being triggered while we are in
the process of exiting the kernel. The ast() function now loops as long
as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with
preemption disabled so that any further AST's that arrive via an
interrupt will be delayed until the low-level MD code returns to user
mode.
- Use u_int's to store the tick counts for profiling purposes so that we
do not need sched_lock just to read p_sticks. This also closes a
problem where the call to addupc_task() could screw up the arithmetic
due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
clear_resched(), and resched_wanted() in favor of direct bit operations
on p_sflag.
- Fix up locking with sched_lock some. In addupc_intr(), use sched_lock
to ensure pr_addr and pr_ticks are updated atomically with setting
PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing
PS_OWEUPC. We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.

Reviewed by: bde (mostly)


# fb919e4d 01-May-2001 Mark Murray <markm@FreeBSD.org>

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h from kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# f34fa851 28-Mar-2001 John Baldwin <jhb@FreeBSD.org>

Catch up to header include changes:
- <sys/mutex.h> now requires <sys/systm.h>
- <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>


# 51c91299 22-Feb-2001 John Baldwin <jhb@FreeBSD.org>

Since the PC is a pointer to a code address, change the second parameter of
addupc_task() and addupc_intr() to be a uintptr_t instead of a u_long.


# 1baf4aab 30-Nov-2000 Alfred Perlstein <alfred@FreeBSD.org>

use an opportunistic locking strategy with the uidinfo structures to avoid
locking the global hash on each uifree()

make struct uidinfo only visible to the kernel

make uihold() a function rather than a macro to reduce bloat

swap the order of a spl/mutex to maintain consistency
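
One way to picture an opportunistic release path, as a hedged sketch rather than the committed code: drop a reference without touching the global hash lock unless the reference might be the last one. C11 atomics stand in here for whatever per-structure locking the kernel used at the time, and all sk_ names, including the lock counter, are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct uidinfo. */
struct sk_uidinfo {
	atomic_uint ref;   /* reference count */
	bool in_hash;      /* still linked into the global hash? */
};

/* Counts global lock acquisitions so the fast path can be observed. */
static unsigned int sk_hash_lock_count;

static void sk_hash_lock(void)   { sk_hash_lock_count++; }
static void sk_hash_unlock(void) { }

/* Fast path: drop one reference only if it is provably not the last. */
static bool
sk_release_if_not_last(struct sk_uidinfo *ui)
{
	unsigned int old = atomic_load(&ui->ref);

	while (old > 1) {
		/* On failure, old is reloaded and the test is repeated. */
		if (atomic_compare_exchange_weak(&ui->ref, &old, old - 1))
			return (true);
	}
	return (false);
}

static void
sk_uifree(struct sk_uidinfo *ui)
{
	unsigned int old;

	if (sk_release_if_not_last(ui))
		return;                    /* global hash lock never taken */

	/* Slow path: possibly the last reference, so serialize on the hash. */
	sk_hash_lock();
	old = atomic_fetch_sub(&ui->ref, 1);
	assert(old > 0);                   /* releasing with no references is a bug */
	if (old == 1)
		ui->in_hash = false;       /* real code would unlink and free */
	sk_hash_unlock();
}
```

Starting from three references, the first two sk_uifree() calls leave sk_hash_lock_count at zero; only the final release takes the global lock.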


# 9c19bcdd 25-Nov-2000 Alfred Perlstein <alfred@FreeBSD.org>

Make uidinfo subsystem mpsafe
use a mutex lock when looking up/deleting entries on the hashlist
use a mutex lock on each uidinfo when updating fields

make uifree() a void function rather than 'int' since no one cares

allocate uidinfo structs with the M_ZERO flag and don't explicitly initialize
them

Assisted by: eivind, jhb, jakeb


# f535380c 05-Sep-2000 Don Lewis <truckman@FreeBSD.org>

Remove uidinfo hash table lookup and maintenance out of chgproccnt() and
chgsbsize(), which are called rather frequently and may be called from an
interrupt context in the case of chgsbsize(). Instead, do the hash table
lookup and maintenance when credentials are changed, which is a lot less
frequent. Add pointers to the uidinfo structures to the ucred and pcred
structures for fast access. Pass a pointer to the credential to chgproccnt()
and chgsbsize() instead of passing the uid. Add a reference count to the
uidinfo structure and use it to decide when to free the structure rather
than freeing the structure when the resource consumption drops to zero.
Move the resource tracking code from kern_proc.c to kern_resource.c. Move
some duplicate code sequences in kern_prot.c to separate helper functions.
Change KASSERTs in this code to unconditional tests and calls to panic().
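
A rough sketch of the shape this change describes, under stated assumptions: the credential carries a cached pointer to the per-uid structure, the accounting function follows that pointer instead of doing a hash lookup, and the structure's lifetime is governed by a reference count rather than by its resource totals. The sk_ types, fields, and signatures are hypothetical and do not reproduce the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct uidinfo and struct ucred. */
struct sk_uidinfo {
	unsigned int uid;
	long proccnt;        /* processes charged to this uid */
	unsigned int ref;    /* credentials pointing at this structure */
};

struct sk_ucred {
	unsigned int uid;
	struct sk_uidinfo *uidinfo;  /* cached when the credential is set up */
};

static void
sk_uihold(struct sk_uidinfo *uip)
{
	uip->ref++;
}

/* Returns 1 when the last reference is gone and the caller may free. */
static int
sk_uifree(struct sk_uidinfo *uip)
{
	if (uip->ref == 0)
		abort();     /* unconditional check, in the spirit of panic() */
	return (--uip->ref == 0);
}

/* Returns 1 on success, 0 if a nonzero max would be exceeded. */
static int
sk_chgproccnt(struct sk_ucred *cr, int diff, long max)
{
	struct sk_uidinfo *uip = cr->uidinfo;   /* no hash lookup here */

	assert(uip != NULL);
	if (diff > 0 && max != 0 && uip->proccnt + diff > max)
		return (0);
	uip->proccnt += diff;
	if (uip->proccnt < 0)
		abort();     /* a negative count means the accounting is broken */
	return (1);
}
```

In this sketch the count can fall back to zero while the structure stays alive, because only sk_uifree() reaching a zero reference count signals that it may be freed.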


# 8950d244 28-Jan-2000 Brian Feldman <green@FreeBSD.org>

Fix a bug that could crash the system if you press ^T while a slower
system is slowed down and in the right spot (a race condition in fork()).

The "previous time" fields have moved from pstat to proc. Anything which
uses KVM needs to be recompiled with a new libkvm/headers.

A couple wacky u_quad_t's in struct proc are now u_int64_t (the same, but
according to lack of 'quad's in proc.h and usage in kern_resource.c).
This will have no effect on code.

This has been make-world-and-installed-new-kernel-which-works-fine-tested.

Reviewed by: bde (previous version)


# 664a31e4 28-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistent with the other
BSDs, which made this change quite some time ago. More commits to come.


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 6fc8f347 13-Mar-1999 Bruce Evans <bde@FreeBSD.org>

Enforce monotonicity of apparent process user, system and interrupt times.

PR: 975, 10402
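
A small sketch of what enforcing monotonicity can look like, assuming the reported times are estimates that may dip between two readings: each value is clamped against the largest value reported so far. The sk_ names and the microsecond fields are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of user, system and interrupt time in microseconds. */
struct sk_times {
	uint64_t user_us;
	uint64_t sys_us;
	uint64_t intr_us;
};

/* Never report less than before; remember the new high-water mark. */
static uint64_t
sk_clamp(uint64_t *prev, uint64_t now)
{
	if (now < *prev)
		now = *prev;
	else
		*prev = now;
	return (now);
}

/* Clamp all three components of cur against the values kept in prev. */
static void
sk_report(struct sk_times *prev, struct sk_times *cur)
{
	assert(prev != NULL && cur != NULL);
	cur->user_us = sk_clamp(&prev->user_us, cur->user_us);
	cur->sys_us = sk_clamp(&prev->sys_us, cur->sys_us);
	cur->intr_us = sk_clamp(&prev->intr_us, cur->intr_us);
}
```

If a second reading estimates 90 us of user time after 100 us was already reported, the caller still sees 100 us, so the apparent times never run backwards.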


# e796e00d 28-May-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Some cleanups related to timecounters and weird ifdefs in <sys/time.h>.

Clean up (or if antipodic: down) some of the msgbuf stuff.

Use an inline function rather than a macro for timecounter delta.

Maintain process "on-cpu" time as 64 bits of microseconds to avoid
needless second rollover overhead.

Avoid calling microuptime the second time in mi_switch() if we do
not pass through _idle in cpu_switch()

This should reduce our context-switch overhead a bit, in particular
on pre-P5 and SMP systems.

WARNING: Programs which muck about with struct proc in userland
will have to be fixed.

Reviewed, but found imperfect by: bde


# 08637435 28-Mar-1998 Bruce Evans <bde@FreeBSD.org>

Moved some #includes from <sys/param.h> nearer to where they are actually
used.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 79df6d85 25-Jun-1996 Bruce Evans <bde@FreeBSD.org>

trap.c:
Fixed profiling of system times. It was pre-4.4Lite and didn't support
statclocks. System times were too small by a factor of 8.

Handle deferred profiling ticks the 4.4Lite way: use addupc_task() instead
of addupc(). Call addupc_task() directly instead of using the ADDUPC()
macro.

Removed vestigial support for PROFTIMER.

switch.s:
Removed addupc().

resourcevar.h:
Removed ADDUPC() and declarations of addupc().

cpu.h:
Updated a comment. i386's never were tahoe's, and the deferred profiling
tick became (possibly) multiple ticks in 4.4Lite.

Obtained from: mostly from NetBSD


# 02e2c406 11-Mar-1996 Peter Wemm <peter@FreeBSD.org>

Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all
files are off the vendor branch, so this should not change anything.

A "U" marker generally means that the file was not changed in between
the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally
means that there was a change.
[new sys/syscallargs.h file, to be "cvs rm"ed]


# 73e8f0c2 10-Mar-1996 Jeffrey Hsu <hsu@FreeBSD.org>

Merge in Lite2: cosmetic indentation change.
We already have the other Lite2 changes, which consist of additional
function prototypes.
Reviewed by: davidg & bde


# 080e34f3 14-Nov-1994 Bruce Evans <bde@FreeBSD.org>

Declare fuswintr() and suswintr() the same as fusword() and susword().
(These functions are implemented in assembler so the compiler can't
check the declarations.)

Clean up prototypes: restore CSRG's alphabetical order, arg names in
prototypes, formatting to fit in 80 columns


# 3d05297c 09-Oct-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Cosmetics. (sort of) Added 19 prototypes.


# b4a8d575 08-Oct-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Added prototypes here and there. Moved pfctlinput into socket.h.


# f86eaaca 02-Oct-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Prototypes, prototypes and even more prototypes. Not quite done yet, but
getting closer all the time.


# 3c4dd356 02-Aug-1994 David Greenman <dg@FreeBSD.org>

Added $Id$


# df8bae1d 24-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

BSD 4.4 Lite Kernel Sources