272461 |
03-Oct-2014 |
gjb |
Copy stable/10@r272459 to releng/10.1 as part of the 10.1-RELEASE process.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
256281 |
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
251423 |
05-Jun-2013 |
alc |
Relax the vm object locking. Use a read lock.
Sponsored by: EMC / Isilon Storage Division
|
248084 |
09-Mar-2013 |
attilio |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
241896 |
22-Oct-2012 |
kib |
Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes.
Conducted and reviewed by: attilio Tested by: pho
|
232278 |
29-Feb-2012 |
mm |
Add procfs to jail-mountable filesystems.
Reviewed by: jamie MFC after: 1 week
|
230145 |
15-Jan-2012 |
trociny |
Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size.
As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp.
Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted procces stack.
Suggested by: kib MFC after: 3 days
|
230132 |
15-Jan-2012 |
uqs |
Convert files to UTF-8
|
227834 |
22-Nov-2011 |
trociny |
In procfs_doproccmdline() if arguments are not cashed read them from the process stack.
Suggested by: kib Reviewed by: kib Tested by: pho MFC after: 2 weeks
|
227393 |
09-Nov-2011 |
kib |
Lock the thread lock around block that retrieves td_wmesg. Otherwise, procfs could see a thread with assigned td_wchan but still NULL td_wmesg.
Reported and tested by: pho MFC after: 1 week
|
227104 |
05-Nov-2011 |
kib |
Fix typo.
MFC after: 3 days
|
225617 |
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
224915 |
16-Aug-2011 |
kib |
Do not return success and a string "unknown" when vn_fullpath() was unable to resolve the path of the text vnode of the process. The behaviour is very confusing for any consumer of the procfs, in particular, java.
Reported and tested by: bf MFC after: 2 weeks Approved by: re (bz)
|
217896 |
26-Jan-2011 |
dchagin |
Add macro to test the sv_flags of any process. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures.
MFC after: 1 month
|
216128 |
02-Dec-2010 |
trasz |
Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage.
Reviewed by: kib@ Tested by: pho@ Sponsored by: FreeBSD Foundation
|
216120 |
02-Dec-2010 |
kib |
For non-stopped threads, td_frame pointer is undefined. As a consequence, fill_regs() and fill_fpregs() access random data, usually on the thread kernel stack. Most often the td_frame points to the previous frame saved by last kernel entry sequence, but this is not guaranteed.
For /proc/<pid>/{regs,fpregs} read access, require the thread to be in stopped state. Otherwise, return EBUSY as is done for write case.
Reported and tested by: pho Approved by: des (procfs maintainer) MFC after: 1 week
|
209062 |
11-Jun-2010 |
avg |
fix a few cases where a string is passed via format argument instead of via %s
Most of the cases looked harmless, but this is done for the sake of correctness. In one case it even allowed to drop an intermediate buffer.
Found by: clang MFC after: 2 week
|
207848 |
10-May-2010 |
kib |
The thread_unsuspend() requires both process mutex and process spinlock locked. Postpone the process unlock till the thread_unsuspend() is called.
Approved by: des (procfs maintainer) MFC after: 1 week
|
207847 |
10-May-2010 |
kib |
For detach procfs ctl command, also clear P_STOPPED_TRACE process stop flag, and for each thread, TDB_SUSPEND debug flag, same as it is done by exit1() for orphaned debugee.
Approved by: des (procfs maintainer) MFC after: 1 week
|
205014 |
11-Mar-2010 |
nwhitehorn |
Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms.
Reviewed by: kib, jhb
|
201954 |
09-Jan-2010 |
brooks |
Update the comment on printing group membership to reflect that fact that each groupt the process is a member of is printed rather than an entry for each group the user could be a member of.
MFC after: 3 days
|
197428 |
23-Sep-2009 |
kib |
Add per-process osrel node to the procfs, to allow read and set p_osrel value for the process.
Approved by: des (procfs maintainer) MFC after: 3 weeks
|
195840 |
24-Jul-2009 |
jhb |
Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
194766 |
23-Jun-2009 |
kib |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4).
Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced.
In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
|
192895 |
27-May-2009 |
jamie |
Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings.
Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge().
Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call.
Approved by: bz (mentor)
|
189282 |
02-Mar-2009 |
kib |
Use the p_sysent->sv_flags flag SV_ILP32 to detect 32bit process executing on 64bit kernel. This eliminates the direct comparisions of p_sysent with &ia32_freebsd_sysvec, that were left intact after r185169.
|
188677 |
16-Feb-2009 |
des |
Fix a logic bug that caused the pfs_attr method to be called only for PFS_PROCDEP nodes.
Submitted by: Andrew Brampton <brampton@gmail.com> MFC after: 2 weeks
|
186563 |
29-Dec-2008 |
kib |
vm_map_lock_read() does not increment map->timestamp, so we should compare map->timestamp with saved timestamp after map read lock is reacquired, not with saved timestamp + 1. The only consequence of the +1 was unconditional lookup of the next map entry, though.
Tested by: pho Approved by: des MFC after: 2 weeks
|
186562 |
29-Dec-2008 |
kib |
Use curproc->p_sysent->sv_flags bit SV_ILP32 for detection of the 32 bit caller, instead of direct comparision with ia32_freebsd_sysvec.
Tested by: pho Approved by: des MFC after: 2 weeks
|
185984 |
12-Dec-2008 |
kib |
Reference the vmspace of the process being inspected by procfs, linprocfs and sysctl kern_proc_vmmap handlers.
Reported and tested by: pho Reviewed by: rwatson, des MFC after: 1 week
|
185864 |
10-Dec-2008 |
kib |
Relock user map earlier, to have the lock held when break leaves the loop earlier due to sbuf error.
Pointy hat to: me Submitted by: dchagin
|
185766 |
08-Dec-2008 |
kib |
Make two style changes to create new commit and document proper commit message for r185765.
Noted by: rdivacky Requested by: des
Commit message for r185765 should be: In procfs map handler, and in linprocfs maps handler, do not call vn_fullpath() while having vm map locked. This is done in anticipation of the vop_vptocnp commit, that would make vn_fullpath sometime acquire vnode lock.
Also, in linprocfs, maps handler already acquires vnode lock.
No objections from: des MFC after: 2 week
|
185765 |
08-Dec-2008 |
kib |
Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work.
Patch is forward-ported des@' one, and was adopted to current code by dchagin@ and me.
Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week
|
184652 |
04-Nov-2008 |
jhb |
Remove unnecessary locking around vn_fullpath(). The vnode lock for the vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed.
In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2).
For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock.
MFC after: 1 month
|
183600 |
04-Oct-2008 |
kib |
Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work.
Patch is forward-ported des@' one, and was adopted to current code by dchagin@ and me.
Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week
|
177091 |
12-Mar-2008 |
jeff |
Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
|
175294 |
13-Jan-2008 |
attilio |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary.
KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
175202 |
10-Jan-2008 |
attilio |
vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
172207 |
17-Sep-2007 |
jeff |
- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM.
Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)
|
170587 |
12-Jun-2007 |
rwatson |
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present.
Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c.
We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.
Reviewed by: csjp Obtained from: TrustedBSD Project
|
170472 |
09-Jun-2007 |
attilio |
rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way.
Reviewed by: jeff Approved by: jeff (mentor)
|
170307 |
05-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
169168 |
01-May-2007 |
des |
The process lock is held when procfs_ioctl() is called. Assert that this is so, and PHOLD the process while sleeping since msleep() will release the lock.
|
168968 |
23-Apr-2007 |
alc |
Add synchronization. Eliminate the acquisition and release of Giant.
Reviewed by: tegge
|
168763 |
15-Apr-2007 |
des |
Instead of stating GIANT_REQUIRED, just acquire and release Giant where needed. This does not make a difference now, but will when procfs is marked MPSAFE.
|
168759 |
15-Apr-2007 |
des |
Fix the same bug as in procfs_doproc{,db}regs(): check that uio_offset is 0 upon entry, and don't reset it before returning.
MFC after: 3 weeks
|
168758 |
15-Apr-2007 |
des |
Don't reset uio_offset to 0 before returning. Instead, refuse to service requests where uio_offset is not 0 to begin with. This fixes a long- standing bug where e.g. 'cat /proc/$$/regs' would loop forever.
MFC after: 3 weeks
|
167482 |
12-Mar-2007 |
des |
Add a pn_destroy field to pfs_node. This field points to a destructor function which is called from pfs_destroy() before the node is reclaimed.
Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor function in addition to the usual attr / fill / vis pointers.
This breaks both the programming and binary interfaces between pseudofs and its consumers. It is believed that there are no pseudofs consumers outside the source tree, so that the impact of this change is minimal.
Submitted by: Aniruddha Bohra <bohra@cs.rutgers.edu>
|
166826 |
19-Feb-2007 |
rwatson |
Do allow PIOCSFL in jail for setguid processes; this is more consistent with other debugging checks elsewhere. XXX comment on the fact that p_candebug() is not being used here remains.
|
166548 |
07-Feb-2007 |
kib |
Fix the race of dereferencing /proc/<pid>/file with execve(2) by caching the value of p_textvp. This way, we always unlock the locked vnode. While there, vhold() the vnode around the vn_lock().
Reported and tested by: Guy Helmer (ghelmer palisadesys com) Approved by: des (procfs maintainer) MFC after: 1 week
|
164936 |
06-Dec-2006 |
julian |
Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
The ULE scheduler compiles again but I have no idea if it works.
The 4bsd scheduler still reqires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
Tested by David Xu, and Dan Eischen using libthr and libpthread.
|
164356 |
17-Nov-2006 |
kib |
Wake up PIOCWAIT handler on the process exit in addition to the stop events. &p->p_stype is explicitely woken up on process exit for us.
Now, truss /nonexistent exits with error instead of waiting until killed by signal.
Reported by: Nikos Vassiliadis nvass at teledomenet gr Reviewed by: jhb MFC after: 1 week
|
164033 |
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
163709 |
26-Oct-2006 |
jb |
Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE).
Reviewed by: davidxu@
|
162711 |
27-Sep-2006 |
ru |
Fix our ioctl(2) implementation when the argument is "int". New ioctls passing integer arguments should use the _IOWINT() macro. This fixes a lot of ioctl's not working on sparc64, most notable being keyboard/syscons ioctls.
Full ABI compatibility is provided, with the bonus of fixing the handling of old ioctls on sparc64.
Reviewed by: bde (with contributions) Tested by: emax, marius MFC after: 1 week
|
159283 |
05-Jun-2006 |
ghelmer |
Upon further review, DES prefers this change over that in revision 1.13 to resolve the directory access problem for processes with P_SUGID flag set.
Suggested by: des
|
158880 |
24-May-2006 |
ghelmer |
Revision 1.4 set access for all sensitive files in /proc/<PID> to mode 0 if a process's uid or gid has changed, but the /proc/<PID> directory itself was also set to mode 0. Assuming this doesn't open any security holes, open access to the /proc/<PID> directory for users other than root to read or search the directory.
Reviewed by: des (back in February) MFC after: 3 weeks
|
155918 |
22-Feb-2006 |
jhb |
Hold the proc lock while calling proc_sstep() since the function asserts it and remove a PRELE() that didn't have a matching PHOLD(). The calling code already has a PHOLD anyway.
MFC after: 1 week
|
153706 |
24-Dec-2005 |
trhodes |
Make tv_sec a time_t on all platforms but alpha. Brings us more in line with POSIX. This also makes the struct correct we ever implement an i386-time64 architecture. Not that we need too.
Reviewed by: imp, brooks Approved by: njl (acpica), des (no objects, touches procfs) Tested with: make universe
|
151316 |
14-Oct-2005 |
davidxu |
1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc.
4. Add td_sigqueue to thread structure to hold all signals delivered to thread.
5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed.
6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64
|
147692 |
30-Jun-2005 |
peter |
Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work.
ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets.
IA64 has got stubs for ia32_reg.c.
Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb pacify it over conflicting formats of ld-elf.so.1.
Approved by: re
|
147676 |
30-Jun-2005 |
peter |
Conditionally weaken sys_generic.c rev 1.136 to allow certain dubious ioctl numbers in backwards compatability mode. eg: an IOC_IN ioctl with a size of zero. Traditionally this was what you did before IOC_VOID existed, and we had some established users of this in the tree, namely procfs. Certain 3rd party drivers with binary userland components also have this too.
This is necessary to have 4.x and 5.x binaries use these ioctl's. We found this at work when trying to run 4.x binaries.
Approved by: re
|
143629 |
15-Mar-2005 |
phk |
Don't export major,minor, instead export tty name.
|
139776 |
06-Jan-2005 |
imp |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
138281 |
01-Dec-2004 |
cperciva |
Fix unvalidated pointer dereference. This is FreeBSD-SA-04:17.procfs.
|
136152 |
05-Oct-2004 |
jhb |
Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand.
- Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. - calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console.
Submitted by: bde (mostly) MFC after: 1 month
|
136004 |
01-Oct-2004 |
das |
Don't PHOLD() the target process in procfs, since this is already done in pseudofs. Moreover, PHOLD() may block between the p_candebug() access check and the actual operation.
|
128019 |
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|
127694 |
01-Apr-2004 |
pjd |
Remove ps_argsopen from this check, because of two reasons: 1. This check if wrong, because it is true by default (kern.ps_argsopen is 1 by default) (p_cansee() is not even checked). 2. Sysctl kern.ps_argsopen is going away.
|
125454 |
04-Feb-2004 |
jhb |
Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists.
Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
|
124219 |
07-Jan-2004 |
rwatson |
Lock p->p_textvp before calling vn_fullpath() on it. Note the potential lock order concern due to the vnode lock held simultaneously by the caller into procfs.
Reported by: kuriyama Approved by: des
|
123247 |
07-Dec-2003 |
des |
Minor whitespace and style issues.
|
121247 |
19-Oct-2003 |
mux |
Remove debug printf().
|
120665 |
02-Oct-2003 |
nectar |
Introduce a uiomove_frombuf helper routine that handles computing and validating the offset within a given memory buffer before handing the real work off to uiomove(9).
Use uiomove_frombuf in procfs to correct several issues with integer arithmetic that could result in underflows/overflows. As a side-effect, the code is significantly simplified.
Add additional sanity checks when computing a memory allocation size in pfs_read.
Submitted by: rwatson (original uiomove_frombuf -- bugs are mine :-) Reported by: Joost Pol <joost@pine.nl> (integer underflows/overflows)
|
120583 |
29-Sep-2003 |
rwatson |
Add a new column to the procfs map to hold the name of the mapped file for vnode mappings. Note that this uses vn_fullpath() and may be somewhat unreliable, although not too unreliable for shared libraries. For non-vnode mappings, just print "-" for the field.
Obtained from: TrustedBSD Projects Sponsored by: DARPA, AFRL, Network Associates Laboratories
|
118907 |
14-Aug-2003 |
rwatson |
Add p_candebug() check to access a process map file in procfs; limit access to map information for processes that you wouldn't otherwise have debug rights on.
Tested by: bms
|
116361 |
15-Jun-2003 |
davidxu |
Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
|
114734 |
05-May-2003 |
rwatson |
Clean up proc locking in procfs: make sure the proc lock is held before entering sys_process.c debugging primitives, or we violate assertions. Also, be more careful about releasing the process lock around calls to uiomove() which may sleep waiting for paging machinations or related notions. We may want to defer the uiomove() in at least one case, but jhb will look into that at a later date.
Reported by: Philippe Charnier <charnier@xp11.frmug.org> Reviewed by: jhb
|
114434 |
01-May-2003 |
des |
Instead of recording the Unix time in a process when it starts, record the uptime. Where necessary, convert it back to Unix time by adding boottime to it. This fixes a potential problem in the accounting code, which would compute the elapsed time incorrectly if the Unix time was stepped during the lifetime of the process.
|
113867 |
22-Apr-2003 |
jhb |
- Always call faultin() in _PHOLD() if PS_INMEM is clear. This closes a race where a thread could assume that a process was swapped in by PHOLD() when it actually wasn't fully swapped in yet. - In faultin(), always msleep() if PS_SWAPPINGIN is set instead of doing this check after bumping p_lock in the PS_INMEM == 0 case. Also, sched_lock is only needed for setting and clearning swapping PS_* flags and the swap thread inhibitor. - Don't set and clear the thread swap inhibitor in the same loops as the pmap_swapin/out_thread() since we have to do it under sched_lock. Instead, mimic the treatment of the PS_INMEM flag and use separate loops to set the inhibitors when clearing PS_INMEM and clear the inhibitors when setting PS_INMEM. - swapout() now returns with the proc lock held as it holds the lock while adjusting the swapping-related PS_* flags so that the proc lock can be used to test those flags. - Only use the proc lock to check the swapping-related PS_* flags in several places. - faultin() no longer requires sched_lock to be held by callers. - Rename PS_SWAPPING to PS_SWAPPINGOUT to be less ambiguous now that we have PS_SWAPPINGIN.
|
113620 |
17-Apr-2003 |
jhb |
- Use a local variable to close a minor race when determining if the wmesg printed out needs a prefix such as when a thread is blocked on a lock. - Use another local variable to close another race for the td_wmesg and td_wchan members of struct thread.
|
113619 |
17-Apr-2003 |
jhb |
Protect p_flag with the proc lock. The sched_lock is not needed to turn off P_STOPPED_SIG in p_flag.
|
113618 |
17-Apr-2003 |
jhb |
- P_SHOULDSTOP just needs proc lock now, so don't acquire sched_lock unless it is needed. - Add a proc lock assertion.
|
113617 |
17-Apr-2003 |
jhb |
Add a proc lock assertion and move another assertion up to the top of the function.
|
111738 |
02-Mar-2003 |
des |
wakeup(9) and msleep(9) take void * arguments, not caddr_t.
|
111585 |
27-Feb-2003 |
julian |
Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.
|
105988 |
26-Oct-2002 |
rwatson |
Slightly change the semantics of vnode labels for MAC: rather than "refreshing" the label on the vnode before use, just get the label right from inception. For single-label file systems, set the label in the generic VFS getnewvnode() code; for multi-label file systems, leave the labeling up to the file system. With UFS1/2, this means reading the extended attribute during vfs_vget() as the inode is pulled off disk, rather than hitting the extended attributes frequently during operations later, improving performance. This also corrects sematics for shared vnode locks, which were not previously present in the system. This chances the cache coherrency properties WRT out-of-band access to label data, but in an acceptable form. With UFS1, there is a small race condition during automatic extended attribute start -- this is not present with UFS2, and occurs because EAs aren't available at vnode inception. We'll introduce a work around for this shortly.
Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
105560 |
20-Oct-2002 |
phk |
Remove even more '&' from pointers to functions.
Spotted by: FlexeLint
|
104306 |
01-Oct-2002 |
jmallett |
Back our kernel support for reliable signal queues.
Requested by: rwatson, phk, and many others
|
104233 |
30-Sep-2002 |
jmallett |
First half of implementation of ksiginfo, signal queues, and such. This gets signals operating based on a TailQ, and is good enough to run X11, GNOME, and do job control. There are some intricate parts which could be more refined to match the sigset_t versions, but those require further evaluation of directions in which our signal system can expand and contract to fit our needs.
After this has been in the tree for a while, I will make in kernel API changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo more robustly, such that we can actually pass information with our (queued) signals to the userland. That will also result in using a struct ksiginfo pointer, rather than a signal number, in a lot of kern_sig.c, to refer to an individual pending signal queue member, but right now there is no defined behaviour for such.
CODAFS is unfinished in this regard because the logic is unclear in some places.
Sponsored by: New Gold Technology Reviewed by: bde, tjr, jake [an older version, logic similar]
|
103767 |
21-Sep-2002 |
jake |
Use the fields in the sysentvec and in the vm map header in place of the constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
|
103216 |
11-Sep-2002 |
julian |
Completely redo thread states.
Reviewed by: davidxu@freebsd.org
|
102950 |
05-Sep-2002 |
davidxu |
s/SGNL/SIG/ s/SNGL/SINGLE/ s/SNGLE/SINGLE/
Fix abbreviation for P_STOPPED_* etc flags, in original code they were inconsistent and difficult to distinguish between them.
Approved by: julian (mentor)
|
101901 |
15-Aug-2002 |
jake |
Fixed 64bit big endian bugs relating to abuse of ioctl argument passing. This makes truss work on sparc64.
|
101132 |
01-Aug-2002 |
rwatson |
Introduce support for Mandatory Access Control and extensible kernel access control.
Modify procfs so that (when mounted multilabel) it exports process MAC labels as the vnode labels of procfs vnodes associated with processes.
Approved by: des Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
100884 |
29-Jul-2002 |
julian |
Create a new thread state to describe threads that would be ready to run except for the fact tha they are presently swapped out. Also add a process flag to indicate that the process has started the struggle to swap back in. This will be needed for the case where multiple threads start the swapin action top a collision. Also add code to stop a process fropm being swapped out if one of the threads in this process is actually off running on another CPU.. that might hurt...
Submitted by: Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>
|
99072 |
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
96886 |
19-May-2002 |
jhb |
Change p_can{debug,see,sched,signal}()'s first argument to be a thread pointer instead of a proc pointer and require the process pointed to by the second argument to be locked. We now use the thread ucred reference for the credential checks in p_can*() as a result. p_canfoo() should now no longer need Giant.
|
95210 |
21-Apr-2002 |
bde |
Include <sys/systm.h> for (at least) the definition of atomic functions which are sometimes used by the macros in <sys/mutex.h>; don't depend on not-quite-necessary namespace pollution in <sys/mutex.h>.
|
95090 |
20-Apr-2002 |
rwatson |
Spelling fix for comment.
|
94624 |
13-Apr-2002 |
jhb |
- Change procfs_control()'s first argument to be a thread pointer instead of a process pointer. - Move the p_candebug() at the start of procfs_control() a bit to make locking feasible. We still perform the access check before doing anything, we just now perform it after acquiring locks. - Don't lock the sched_lock for TRACE_WAIT_P() and when checking to see if p_stat is SSTOP. We lock the process while setting p_stat to SSTOP so locking the process is sufficient to do a read to see if p_stat is SSTOP or not.
|
94623 |
13-Apr-2002 |
jhb |
Lock the target process for p_candebug().
|
94622 |
13-Apr-2002 |
jhb |
Lock the target process in procfs_doproc*regs() for p_candebug and while reading/writing the registers.
|
94620 |
13-Apr-2002 |
jhb |
- p_cansee() needs the target process locked. - We need the proc lock held for more of procfs_doprocstatus().
|
93593 |
01-Apr-2002 |
jhb |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
|
93393 |
29-Mar-2002 |
alfred |
Protect proc struct (p_args and p_comm) when doing procfs IO that pulls data from it.
Submitted by: Jonathan Mini <mini@haikugeek.com>
|
92727 |
19-Mar-2002 |
alfred |
Remove __P.
|
91140 |
23-Feb-2002 |
tanimura |
Lock struct pgrp, session and sigio.
New locks are:
- pgrpsess_lock which locks the whole pgrps and sessions, - pg_mtx which protects the pgrp members, and - s_mtx which protects the session members.
Please refer to sys/proc.h for the coverage of these locks.
Changes on the pgrp/session interface:
- pgfind() needs the pgrpsess_lock held.
- The caller of enterpgrp() is responsible to allocate a new pgrp and session.
- Call enterthispgrp() in order to enter an existing pgrp.
- pgsignal() requires a pgrp lock held.
Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)
|
90873 |
18-Feb-2002 |
des |
Paranoia: if the process is setugid, set all sensitive files mode 0.
|
90717 |
16-Feb-2002 |
bde |
FIxed the following style bugs: - clobbering of jsp's $Id$ by FreeBSD's old $Id$. - long lines in recent KSE changes (procfs_ctl.c). - other style bugs in KSE changes (most related to an shadowed variable in procfs_status.c -- the td in the outer scope is obfuscated by PFS_FILL_ARGS).
Approved by: des
|
90716 |
16-Feb-2002 |
bde |
FIxed the following style bugs: - clobbering of jsp's $Id$ by FreeBSD's old $Id$. - lost Berkeley id in procfs_dbregs.c - long lines in recent KSE changes. - various gratuitous differences between procfs_*regs.c.
|
90715 |
16-Feb-2002 |
bde |
Fixed missing PHOLD()/PRELE().
Obtained from: procfs_dbregs.c Approved by: des
|
90361 |
07-Feb-2002 |
julian |
Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embeded there but remove all teh code that assumes that in preparation for the next commit which will actually move it out.
Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
|
87669 |
11-Dec-2001 |
des |
Remove an obsolete prototype for procfs_kmemaccess().
Submitted by: rwatson
|
87542 |
09-Dec-2001 |
des |
Fix various bugs in the debugging code and reenable it.
|
87538 |
08-Dec-2001 |
des |
Fix a KSEfication brain-o in procfs_doprocfile(): return the path of the target process, not the calling process. While we're here, also unstaticize procfs_doprocfile() and procfs_docurproc() so linprocfs can call them directly instead of duplicating them.
Submitted by: Dominic Mitchell <dom@semantico.com>
|
87321 |
04-Dec-2001 |
des |
Pseudofsize procfs(5).
|
87275 |
03-Dec-2001 |
rwatson |
o Introduce pr_mtx into struct prison, providing protection for the mutable contents of struct prison (hostname, securelevel, refcount, pr_linux, ...) o Generally introduce mtx_lock()/mtx_unlock() calls throughout kern/ so as to enforce these protections, in particular, in kern_mib.c protection sysctl access to the hostname and securelevel, as well as kern_prot.c access to the securelevel for access control purposes. o Rewrite linux emulator abstractions for accessing per-jail linux mib entries (osname, osrelease, osversion) so that they don't return a pointer to the text in the struct linux_prison, rather, a copy to an array passed into the calls. Likewise, update linprocfs to use these primitives. o Update in_pcb.c to always use prison_getip() rather than directly accessing struct prison.
Reviewed by: jhb
|
86165 |
07-Nov-2001 |
peter |
Fix printf format bugs introduced in rev 1.34 for printing times. quad_t cannot be printed with %lld on 64 bit systems.
Dont waste cpu to round user and system times up to long long, it is highly improbable that a process will have accumulated 68 years of user or system cpu time (not wall clock time) before a reboot or process restart.
|
86136 |
06-Nov-2001 |
green |
Correctly unlock the target process if /proc/$foo/mem is open()ed by another process which cannot p_candebug() it. The bug was introduced in rev. 1.100.
Approved by: des
|
85644 |
28-Oct-2001 |
dillon |
Adjust printfs to be time_t agnostic.
|
85320 |
22-Oct-2001 |
des |
No, you may not /* FALLTHROUGH */. Not only will you return an incorrect result, but you'd corrupt the kernel malloc() arena if it weren't for a small but life-saving optimization in ioctl().
MFC after: 1 week
|
85297 |
21-Oct-2001 |
des |
Move procfs_* from procfs_machdep.c into sys_process.c, and rename them to proc_* in the process; procfs_machdep.c is no longer needed.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
|
84637 |
07-Oct-2001 |
des |
Dissociate ptrace from procfs.
Until now, the ptrace syscall was implemented as a wrapper that called various functions in procfs depending on which ptrace operation was requested. Most of these functions were themselves wrappers around procfs_{read,write}_{,db,fp}regs(), with only some extra error checks, which weren't necessary in the ptrace case anyway.
This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c (renaming it to proc_rwmem() in the process), and implements ptrace() directly in terms of procfs_{read,write}_{,db,fp}regs() instead of having it fake up a struct uio and then call procfs_do{,db,fp}regs().
It also moves the prototypes for procfs_{read,write}_{,db,fp}regs() and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files except procfs_machdep.c as "optional procfs" instead of "standard".
|
84634 |
07-Oct-2001 |
des |
Remove some useless preprocesor paranoia.
|
84633 |
07-Oct-2001 |
des |
In procfs_readdir(), when the directory being read was a process directory, the target process was being held locked during the uiomove() call. If the process calling readdir() was the same as the target process (for instance 'ls /proc/curproc/'), and uiomove() caused a page fault, the result would be a proc lock recursion. I have no idea how long this has been broken - possibly ever since pfind() was changed to lock the process it returns.
Also replace the one and only call to procfs_findtextvp() with a direct test of td->td_proc->p_textvp.
|
83920 |
25-Sep-2001 |
mike |
A process name may contain whitespace and unprintable characters, so convert those characters to octal notation. Also convert backslashes to octal notation to avoid confusion.
Reviewed by: des MFC after: 1 week
|
83635 |
18-Sep-2001 |
rwatson |
o Remove redundant securelevel/pid1 check in procfs_rw() -- this protection is enforced at the invidual method layer using p_candebug().
Obtained from: TrustedBSD Project
|
83366 |
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
81112 |
03-Aug-2001 |
rwatson |
Remove dangling prototype for the now defunct procfs_kmemaccess() call.
Obtained from: TrustedBSD Project
|
81109 |
03-Aug-2001 |
rwatson |
Collapse a Pmem case in with the other debugging files case for procfs, as there are now "unusual" protection properties to Pmem that differ from the other files. While I'm at it, introduce proc locking for the other files, which was previously present only in the Pmem case.
Obtained from: TrustedBSD Project
|
81108 |
03-Aug-2001 |
rwatson |
Remove read permission for group on the /proc/*/mem file, since kmem no longer requires access.
Reviewed by: tmm Obtained from: TrustedBSD Project
|
81107 |
03-Aug-2001 |
rwatson |
Prior to support for almost all ps activity via sysctl, ps used procfs, and so special-casing was introduced to provide extra procfs privilege to the kmem group. With the advent of non-setgid kmem ps, this code is no longer required, and in fact, can is potentially harmful as it allocates privilege to a gid that is increasingly less meaningful. Knowledge of specific gid's in kernel is also generally bad precedent, as the kernel security policy doesn't distinguish gid's specifically, only uid 0.
This commit removes reference to kmem in procfs, both in terms of access control decisions, and the applying of gid kmem to the /proc/*/mem file, simplifying the associated code considerably. Processes are still permitted to access the mem file based on the debugging policy, so ps -e still works fine for normal processes and use.
Reviewed by: tmm Obtained from: TrustedBSD Project
|
79335 |
05-Jul-2001 |
rwatson |
o Replace calls to p_can(..., P_CAN_xxx) with calls to p_canxxx(). The p_can(...) construct was a premature (and, it turns out, awkward) abstraction. The individual calls to p_canxxx() better reflect differences between the inter-process authorization checks, such as differing checks based on the type of signal. This has a side effect of improving code readability. o Replace direct credential authorization checks in ktrace() with invocation of p_candebug(), while maintaining the special case check of KTR_ROOT. This allows ktrace() to "play more nicely" with new mandatory access control schemes, as well as making its authorization checks consistent with other "debugging class" checks. o Eliminate "privused" construct for p_can*() calls which allowed the caller to determine if privilege was required for successful evaluation of the access control check. This primitive is currently unused, and as such, serves only to complicate the API.
Approved by: ({procfs,linprocfs} changes) des Obtained from: TrustedBSD Project
|
79224 |
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
77799 |
06-Jun-2001 |
tanimura |
Lock VM Giant prior to locking a vm map.
Spotted by: Daniel Rock <D.Rock@t-online.de> Tested by: David Wolfskill <david@catwhisker.org>, Sean Eric Fagan <sef@kithrup.com>
|
77183 |
25-May-2001 |
rwatson |
o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account.
Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit
|
77031 |
23-May-2001 |
ru |
- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs.
- Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs.
- Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.
- Install header files for the above file systems.
- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.
|
76827 |
19-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
76491 |
11-May-2001 |
jhb |
GC prototype for procfs_bmap() missed during a previous commit.
|
76166 |
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
76131 |
29-Apr-2001 |
phk |
Add a vop_stdbmap(), and make it part of the default vop vector.
Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.
|
76117 |
29-Apr-2001 |
grog |
Revert consequences of changes to mount.h, part 2.
Requested by: bde
|
75893 |
24-Apr-2001 |
jhb |
Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning.
Reviewed by: -smp, dfr, jake
|
75858 |
23-Apr-2001 |
grog |
Correct #includes to work with fixed sys/mount.h.
|
74996 |
29-Mar-2001 |
jhb |
- Various style fixes. - Fix a silly bug so that we return the actual error code if a procfs attach fails rather than always returning 0.
Reported by: bde
|
74927 |
28-Mar-2001 |
jhb |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
74914 |
28-Mar-2001 |
jhb |
Catch up to header include changes: - <sys/mutex.h> now requires <sys/systm.h> - <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>
|
73920 |
07-Mar-2001 |
jhb |
Proc locking identical to that of linprocfs' vnops except that we hold the proc lock while calling psignal.
|
73919 |
07-Mar-2001 |
jhb |
Protect read to p_pptr with proc lock rather than proctree lock.
|
73918 |
07-Mar-2001 |
jhb |
Proc locking. Lock around psignal() and also ensure both an exclusive proctree lock and the process lock are held when updating p_pptr and p_oppid. When we are just reaading p_pptr we only need the proc lock and not a proctree lock as well.
|
73906 |
07-Mar-2001 |
jhb |
Protect p_flag with the proc lock.
|
73383 |
03-Mar-2001 |
dfr |
Remove the copyinstr call which was trying to copy the pathname in from user space. It has already been copied in and mp->mnt_stat.f_mntonname has already been initialised by the caller.
This fixes a panic on the alpha caused by the fact that the variable 'size' wasn't initialised because the call to copyinstr() bailed out with an EFAULT error.
|
72786 |
21-Feb-2001 |
rwatson |
o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison). o Allow jail inheritence to be a function of credential inheritence. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use.
Notes:
o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure.
Reviewed by: freebsd-arch Obtained from: TrustedBSD Project
|
72200 |
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
71999 |
04-Feb-2001 |
phk |
Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details.
Created with: sed(1) Reviewed by: md5(1)
|
71569 |
24-Jan-2001 |
jhb |
- Catch up to proc flag changes.
|
70536 |
31-Dec-2000 |
phk |
Use macro API to <sys/queue.h>
|
70317 |
23-Dec-2000 |
jake |
Protect proc.p_pptr and proc.p_children/p_sibling with the proctree_lock.
linprocfs not locked pending response from informal maintainer.
Reviewed by: jhb, -smp@
|
69958 |
13-Dec-2000 |
rwatson |
o Tighten restrictions on use of /proc/pid/ctl and move access checks in ctl to using centralized p_can() inter-process access control interface.
Reviewed by: sef
|
69947 |
13-Dec-2000 |
jake |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.
|
69798 |
09-Dec-2000 |
des |
Add a module version (so that linprocfs can properly depend on procfs)
|
69507 |
02-Dec-2000 |
jhb |
Protect p_stat with the sched_lock.
Reviewed by: jake
|
68505 |
08-Nov-2000 |
eivind |
More paranoia against overflows
|
68199 |
01-Nov-2000 |
eivind |
Fix overflow from jail hostname.
Bug found by: Esa Etelavuori <eetelavu@cc.hut.fi>
|
66701 |
05-Oct-2000 |
alfred |
return correct type for process directory entries, DT_DIR not DT_REG
|
65445 |
04-Sep-2000 |
des |
Remove a comment that has been not only obsolete but patently wrong for the last 31 revisions (almost three years).
|
65339 |
01-Sep-2000 |
rwatson |
o Simplify if/then clause equating ESRCH with ENOENT when hiding a process
Submitted by: des
|
65331 |
01-Sep-2000 |
rwatson |
o Make procfs use vaccess() for procfs_access() DAC and super-user checks, rather than implementing its own {uid,gid,other} checks against vnode mode. Similar change to linprocfs currently under review.
Obtained from: TrustedBSD Project
|
65237 |
30-Aug-2000 |
rwatson |
o Centralize inter-process access control, introducing:
int p_can(p1, p2, operation, privused)
which allows specification of subject process, object process, inter-process operation, and an optional call-by-reference privused flag, allowing the caller to determine if privilege was required for the call to succeed. This allows jail, kern.ps_showallprocs and regular credential-based interaction checks to occur in one block of code. Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL, and P_CAN_DEBUG. p_can currently breaks out as a wrapper to a series of static function checks in kern_prot, which should not be invoked directly.
o Commented out capabilities entries are included for some checks.
o Update most inter-process authorization to make use of p_can() instead of manual checks, PRISON_CHECK(), P_TRESPASS(), and kern.ps_showallprocs.
o Modify suser{,_xxx} to use const arguments, as it no longer modifies process flags due to the disabling of ASU.
o Modify some checks/errors in procfs so that ENOENT is returned instead of ESRCH, further improving concealment of processes that should not be visible to other processes. Also introduce new access checks to improve hiding of processes for procfs_lookup(), procfs_getattr(), procfs_readdir(). Correct a bug reported by bp concerning not handling the CREATE case in procfs_lookup(). Remove volatile flag in procfs that caused apparently spurious qualifier warnigns (approved by bde).
o Add comment noting that ktrace() has not been updated, as its access control checks are different from ptrace(), whereas they should probably be the same. Further discussion should happen on this topic.
Reviewed by: bde, green, phk, freebsd-security, others Approved by: bde Obtained from: TrustedBSD Project
|
64819 |
18-Aug-2000 |
phk |
Introduce vop_stdinactive() and make it the default if no vop_inactive is declared.
Sort and prune a few vop_op[].
|
59760 |
29-Apr-2000 |
phk |
Remove unneeded #include <sys/kernel.h>
|
59652 |
26-Apr-2000 |
green |
Move procfs_fullpath() to vfs_cache.c, with a rename to textvp_fullpath(). There's no excuse to have code in synthetic filestores that allows direct references to the textvp anymore.
Feature requested by: msmith Feature agreed to by: warner Move requested by: phk Move agreed to by: bde
|
59522 |
22-Apr-2000 |
green |
Quiet an unused variable warning by commenting out a variable declaration that goes with a commented out statement.
|
59482 |
22-Apr-2000 |
green |
There's no reason to make "file" 0500 rather than 0555.
|
59481 |
22-Apr-2000 |
green |
Welcome back our old friend from procfs, "file"!
|
55206 |
29-Dec-1999 |
peter |
Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistant with the other BSD's who made this change quite some time ago. More commits to come.
|
55153 |
27-Dec-1999 |
peter |
Fix typo "," vs ";"
PR: 15696 Submitted by: Takashi Okumura <taka@cs.pitt.edu>
|
54908 |
20-Dec-1999 |
eivind |
Include vm/vm_extern.h to get at prototypes
|
54803 |
19-Dec-1999 |
rwatson |
Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry.
Reviewed by: eivind
|
54655 |
15-Dec-1999 |
eivind |
Introduce NDFREE (and remove VOP_ABORTOP)
|
54424 |
11-Dec-1999 |
peter |
Don't simulate a pseudo address-space beyond VM_MAXUSER_ADDRESS that maps onto the upages. We used to use this extensively, particularly for ps and gdb. Both of these have been "fixed". ps gets the p_stats via eproc along with all the other stats, and gdb uses the regs, fpregs etc files.
Once apon a time the UPAGES were mapped here, but that changed back in January '96. This essentially kills my revisions 1.16 and 1.17. The 2-page "hole" above the stack can be reclaimed now.
|
54292 |
08-Dec-1999 |
phk |
Remove unused #includes.
Obtained from: http://bogon.freebsd.dk/include
|
53709 |
26-Nov-1999 |
phk |
Add a sysctl to control if argv is disclosed to the world: kern.ps_argsopen It defaults to 1 which means that all users can see all argvs in ps(1).
Reviewed by: Warner
|
53518 |
21-Nov-1999 |
phk |
Introduce the new function p_trespass(struct proc *p1, struct proc *p2) which returns zero or an errno depending on the legality of p1 trespassing on p2.
Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one extra signal related check.
Replace procfs.h:CHECKIO() macros with calls to p_trespass().
Only show command lines to process which can trespass on the target process.
|
53503 |
21-Nov-1999 |
phk |
s/p_cred->pc_ucred/p_ucred/g
|
53467 |
20-Nov-1999 |
sef |
A process should be able to examine itself.
|
53301 |
17-Nov-1999 |
phk |
Make proc/*/cmdline use the cached argv if available.
Submitted by: Paul Saab <paul@mu.org> Reviewed by: phk
|
53300 |
17-Nov-1999 |
phk |
The function `procfs_getattr()' in procfs doesn't set the value of vap->va_fsid, so we cannot get valid information about procfs.
Submitted by: SAWADA Mizuki miz@pa.aix.or.jp Reviewed by: phk PR: 1654
|
53045 |
09-Nov-1999 |
alc |
Passing "0" or "FALSE" as the fourth argument to vm_fault is wrong. It should be "VM_FAULT_NORMAL".
|
52990 |
08-Nov-1999 |
sef |
Explain why Warner is right, and I am wrong, in the removing of the file object. Also explain some possible directions to re-implement it -- I'm not sure it should be, given the minimal application use. (Other than having the debugger automatically access the symbols for a process, the main use I'd found was with some minor accounting ability, but _that_ depends on it being in the filesystem space; an ioctl access method would be useless in that case.)
This is a code-less change; only a comment has been added.
|
52961 |
07-Nov-1999 |
sef |
Make an incredibly stupid change because Warner threatened to do it and continue doing it despite objections by me (the principal author).
Note that this doesn't fix the real problem -- the real problem is generally bad setup by ignorant users, and education is the right way to fix it.
So while this doesn't actually solve the prolem mentioned in the complaint (since it's still possible to do it via other methods, although they mostly involve a bit more complicity), and there are better methods to do this, nobody was willing or able to provide me with a real world example that couldn't be worked around using the existing permissions and group mechanism. And therefore, security by removing features is the method of the day.
I only had three applications that used it, in any event. One of them would have made debugging easier, but I still haven't finished it, and won't now, so it doesn't really matter.
|
52635 |
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
51791 |
29-Sep-1999 |
marcel |
sigset_t change (part 2 of 5) -----------------------------
The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements.
The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t.
struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done though function sigprop(). The latter because the table doesn't holds properties for all signals, but only for the first NSIG signals.
signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace polution.
Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types.
NOTE: kdump (and port linux_kdump) must be recompiled.
Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.
|
51138 |
11-Sep-1999 |
alfred |
Seperate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP.
Add fh(open|stat|stafs) syscalls to allow userland to query filesystems based on (network) filehandle.
Obtained from: NetBSD
|
51068 |
07-Sep-1999 |
alfred |
All unimplemented VFS ops now have entries in kern/vfs_default.c that return reasonable defaults.
This avoids confusing and ugly casting to eopnotsupp or making dummy functions. Bogus casting of filesystem sysctls to eopnotsupp() have been removed.
This should make *_vfsops.c more readable and reduce bloat.
Reviewed by: msmith, eivind Approved by: phk Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
|
50477 |
28-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
50061 |
19-Aug-1999 |
marcel |
Let processes retrieve their argv through procfs. Revert to the original behaviour in all other cases.
Submitted by: Andrew Gordon <arg@arg1.demon.co.uk>
|
49525 |
08-Aug-1999 |
bde |
Fixed printf format errors (%qu -> %llu; the arg was already unsigned long long to hide problems on alphas).
|
48719 |
09-Jul-1999 |
phk |
Allow jailed proccesses to open non-process vnodes like the root of the fs.
|
48715 |
09-Jul-1999 |
peter |
Use %q rather than rolling a custom routine.
|
48692 |
09-Jul-1999 |
jlemon |
Support for i386 hardware breakpoints.
Submitted by: Brian Dean <brdean@unx.sas.com>
|
48691 |
09-Jul-1999 |
jlemon |
Implement support for hardware debug registers on the i386.
Submitted by: Brian Dean <brdean@unx.sas.com>
|
47897 |
13-Jun-1999 |
phk |
Eliminate the bogus procfs private almost struct dirent structure.
Spotted by: Lars Hamren Reviewed by: bde
|
47407 |
22-May-1999 |
dt |
Don't call calcru() on a swapped-out process. calcru() access p_stats, which is in U-area.
|
46389 |
04-May-1999 |
phk |
Make the type and map files claim 0 bytes size. Tar doesn't get confused now, but doesn't store any data eiter.
I wonder if we shouldn't claim to be fifos instead...
|
46388 |
04-May-1999 |
phk |
Add even more () to CHECKIO which by now feels positively LISPish.
Submitted by: bde Reviewed by: phk
|
46201 |
30-Apr-1999 |
phk |
Add a new "file" to procfs: "rlimit" which shows the resource limits for the process.
PR: 11342 Submitted by: Adrian Chadd adrian@freebsd.org Reviewed by: phk
|
46155 |
28-Apr-1999 |
phk |
This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do.
For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers".
Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname.
Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors.
It generally does what one would expect, but setting up a jail still takes a little knowledge.
A few notes:
I have no scripts for setting up a jail, don't ask me for them.
The IP number should be an alias on one of the interfaces.
mount a /proc in each jail, it will make ps more useable.
/proc/<pid>/status tells the hostname of the prison for jailed processes.
Quotas are only sensible if you have a mountpoint per prison.
There are no privisions for stopping resource-hogging.
Some "#ifdef INET" and similar may be missing (send patches!)
If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome!
Tools, comments, patches & documentation most welcome.
Have fun...
Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/
|
46116 |
27-Apr-1999 |
phk |
Change suser_xxx() to suser() where it applies.
|
46112 |
27-Apr-1999 |
phk |
Suser() simplification:
1: s/suser/suser_xxx/
2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with later.
There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
|
44146 |
19-Feb-1999 |
luoqi |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper.
Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillion <dillon@apollo.backplane.com>
|
43748 |
07-Feb-1999 |
dillon |
Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.
|
43634 |
05-Feb-1999 |
jdp |
Correct a format mismatch on 64-bit architectures. This should fix the erroneous values in the procfs "map" file on the Alpha.
|
43305 |
27-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
42957 |
21-Jan-1999 |
dillon |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
42301 |
05-Jan-1999 |
peter |
A partial implementation of the procfs cmdline pseudo-file. This is enough to satisfy things like StarOffice. This is a hack, but doing it properly would be a LOT of work, and would require extensive grovelling around in the user address space to find the argv[].
Obtained from: Mostly from Andrzej Bialecki <abial@nask.pl>.
|
41514 |
04-Dec-1998 |
archie |
Examine all occurrences of sprintf(), strcat(), and str[n]cpy() for possible buffer overflow problems. Replaced most sprintf()'s with snprintf(); for others cases, added terminating NUL bytes where appropriate, replaced constants like "16" with sizeof(), etc.
These changes include several bug fixes, but most changes are for maintainability's sake. Any instance where it wasn't "immediately obvious" that a buffer overflow could not occur was made safer.
Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Matthew Dillon <dillon@apollo.backplane.com> Reviewed by: Mike Spengler <mks@networkcs.com>
|
40700 |
28-Oct-1998 |
dg |
Added a second argument, "activate" to the vm_page_unwire() call so that the caller can select either inactive or active queue to put the page on.
|
38909 |
07-Sep-1998 |
bde |
Removed statically configured mount type numbers (MOUNT_*) and all references to them.
The change a couple of days ago to ignore these numbers in statically configured vfsconf structs was slightly premature because the cd9660, cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number in their vfsconf struct.
|
37898 |
27-Jul-1998 |
alex |
Style fixes and a bug fix: don't remove the exit handler if unmount fails.
Submitted by: bde
|
37877 |
27-Jul-1998 |
alex |
A better solution to the rm_at_exit problem: Register the exit function during first mount. Unregister the exit function at last unmount.
Concept by: sef Reviewed by: sef Implemented by: alex
|
37864 |
25-Jul-1998 |
alex |
Override the default VFS LKM dispatch functions so that a module unload function can be provided (this is necessary to unregister the at_exit handler).
|
37649 |
15-Jul-1998 |
bde |
Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this changes is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.
|
37555 |
11-Jul-1998 |
bde |
Fixed printf format errors.
|
37465 |
07-Jul-1998 |
bde |
Quick fix for type mismatches which were fatal if longs aren't 32 bits. We used a private, wrong, version of `struct dirent' to help break getdirentries(), and we use a silly check that the size of this struct is a power of 2 to help break mount() if getdirentries() would not work. This fix just changes the struct to match `struct dirent' (except for the name length).
|
37154 |
25-Jun-1998 |
dt |
Remove "not hungly" panics. Cookies now used by the linux and ibcs2 emulators. The emulators assume that filesystem may just ignore cookies, and handle this case correctly. So we just ignore cookies.
Also sync *_readdir "prototypes" with reality.
|
36969 |
14-Jun-1998 |
bde |
Avoid a 64-bit division in procfs_readdir(). Fixed related overflows. Check args using the same expression as in fdesc and kernfs. The check was actually already correct, modulo overflow. It could be tightened up to either allow huge (aligned) offsets, treating them as EOF, or disallow all offsets beyond EOF.
Didn't fix invalid address calculation &foo[i] where i may be out of bounds.
Didn't fix shooting of foot using a private unportable dirent struct.
|
36840 |
10-Jun-1998 |
peter |
Don't silently accept attempts to change flags where they are not supported.
|
36735 |
07-Jun-1998 |
dfr |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change.
The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
36168 |
19-May-1998 |
tegge |
Disallow reading the current kernel stack. Only the user structure and the current registers should be accessible. Reviewed by: David Greenman <dg@root.com>
|
35769 |
06-May-1998 |
msmith |
As described by the submitter:
Reverse the VFS_VRELE patch. Reference counting of vnodes does not need to be done per-fs. I noticed this while fixing vfs layering violations. Doing reference counting in generic code is also the preference cited by John Heidemann in recent discussions with him.
The implementation of alternative vnode management per-fs is still a valid requirement for some filesystems but will be revisited sometime later, most likely using a different framework.
Submitted by: Michael Hancock <michaelh@cet.co.jp>
|
35497 |
29-Apr-1998 |
dyson |
Tighten up management of memory and swap space during map allocation, deallocation cycles. This should provide a measurable improvement on swap and memory allocation on loaded systems. It is unlikely a complete solution. Also, provide more map info with procfs. Chuck Cranor spurred on this improvement.
|
35256 |
17-Apr-1998 |
des |
Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.
|
34901 |
26-Mar-1998 |
phk |
Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable. gettime() is now a macro front for getmicrotime().
Various patches to use the two new functions instead of the various hacks used in their absence.
Some puntuation and grammer patches from Bruce.
A couple of XXX comments.
|
33964 |
01-Mar-1998 |
msmith |
The intent is to get rid of WILLRELE in vnode_if.src by making a complement to all ops that return a vpp, VFS_VRELE. This is initially only for file systems that implement the following ops that do a WILLRELE:
vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link, vop_rename, vop_mkdir, vop_rmdir, vop_symlink
This is initial DNA that doesn't do anything yet. VFS_VRELE is implemented but not called.
A default vfs_vrele was created for fs implementations that use the standard vnode management routines.
VFS_VRELE implementations were made for the following file systems:
Standard (vfs_vrele) ffs mfs nfs msdosfs devfs ext2fs
Custom union umapfs
Just EOPNOTSUPP fdesc procfs kernfs portal cd9660
These implementations may change as VOP changes are implemented.
In the next phase, in the vop implementations calls to vrele and the vrele part of vput will be moved to the top layer vfs_vnops and made visible to all layers. vput will be replaced by unlock in these cases. Unlocking will still be done in the per fs layer but the refcount decrement will be triggered at the top because it doesn't hurt to hold a vnode reference a little longer. This will have minimal impact on the structure of the existing code.
This will only be done for vnode arguments that are released by the various fs vop implementations.
Wider use of VFS_VRELE will likely require restructuring of the code.
Reviewed by: phk, dyson, terry et. al. Submitted by: Michael Hancock <michaelh@cet.co.jp>
|
33181 |
09-Feb-1998 |
eivind |
Staticize.
|
33134 |
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
33108 |
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
32702 |
22-Jan-1998 |
dyson |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneded collpase operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
32286 |
06-Jan-1998 |
dyson |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does.
When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex.
When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore.
A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes.
Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.
|
32285 |
06-Jan-1998 |
sef |
Use CHECKIO in procfs_ioctl() to ensure that any changes in UID/GID result in the expected failure.
|
32120 |
30-Dec-1997 |
bde |
Fixed a missing/misplaced/misstyled prototype.
|
32011 |
27-Dec-1997 |
bde |
Unspammed nested include of <vm/vm_zone.h>.
|
31891 |
20-Dec-1997 |
sef |
Clear the p_stops field on change of user/group id, unless the correct flag is set in the p_pfsflags field. This, essentially, prevents an SUID proram from hanging after being traced. (E.g., "truss /usr/bin/rlogin" would fail, but leave rlogin in a stopevent state.) Yet another case where procctl is (hopefully ;)) no longer needed in the general case.
Reviewed by: bde (thanks bruce :))
|
31691 |
13-Dec-1997 |
sef |
Change the ioctls for procfs around a bit; in particular, whever possible, change from
ioctl(fd, PIOC<foo>, &i);
to
ioctl(fd, PIOC<foo>, i);
This is going from the _IOW to _IO ioctl macro. The kernel, procctl, and truss must be in synch for it all to work (not doing so will get errors about inappropriate ioctl's, fortunately). Hopefully I didn't forget anything :).
|
31674 |
12-Dec-1997 |
sef |
Fix a problem with procfs_exit() that resulted in missing some procfs nodes; this also apparantly caused a panic in some circumstances. Also, since procfs_exit() is getting rid of the nodes when a process exits, don't bother checking for the process' existance in procfs_inactive().
|
31640 |
09-Dec-1997 |
sef |
Code to prevent a panic caused by procfs_exit(). Note that i don't know what is teh root cause -- but, sometimes, a procfs vnode in pfshead is apparantly corrupt (or a UFS vnode instead). Without this patch, I can get it to panic by doing (in csh)
while (1) ps auxwww end
and it will panic when the PID's wrap. With it, it does not panic. Yes -- I know that this is NOT the right way to fix it. But I haven't been able to get it to panic yet (which confuses me). I am going to be looking into the vgone() code now, as that may be a part of it.
|
31636 |
08-Dec-1997 |
sef |
A couple of fixes from bruce: first of all, psignal is a void (stupid me; unfortunately, also makes it hard ot check for errors); second, I had managed to forget a change to PIOCSFL (it should be _IOW, not _IOR) I had in my local copy, and Bruce called me on it.
Submitted by: bde
|
31618 |
08-Dec-1997 |
sef |
Use at_exit() to invoke procfs_exit() instead of calling it directly. Note that an unload facility should be used to call rm_at_exit() (if procfs is being loaded as an LKM and is subsequently removed), but it was non-obvious how to do this in the VFS framework.
Reviewed by: Julian Elischer
|
31595 |
07-Dec-1997 |
sef |
Clear the stop events and wakeup the process on teh last close of the procfs/mem file. While this doesn't prevent an unkillable process, it means that a broken truss prorgam won't do it accidently now (well, there's a small window of opportunity). Note that this requires the change to truss I am about to commit.
|
31564 |
06-Dec-1997 |
sef |
Changes to allow event-based process monitoring and control.
|
31561 |
05-Dec-1997 |
bde |
Don't include <sys/lock.h> in headers when only `struct simplelock' is required. Fixed everything that depended on the pollution.
|
31174 |
14-Nov-1997 |
tegge |
Don't try to obtain an excluive lock on the vm map, since a deadlock might occur if the process owning the map is wiring pages.
|
31016 |
07-Nov-1997 |
phk |
Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by: -Wunused
|
30785 |
27-Oct-1997 |
bde |
KNFize rev.1.31.
|
30780 |
27-Oct-1997 |
bde |
Removed unused #includes. The need for most of them went away with recent changes (docluster* and vfs improvements).
|
30743 |
26-Oct-1997 |
phk |
VFS interior redecoration.
Rename vn_default_error to vop_defaultop all over the place. Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite. Use vop_null instead of nullop. Move vop_nopoll from vfs_subr.c to vfs_default.c Move vop_sharedlock from vfs_subr.c to vfs_default.c Move vop_nolock from vfs_subr.c to vfs_default.c Move vop_nounlock from vfs_subr.c to vfs_default.c Move vop_noislocked from vfs_subr.c to vfs_default.c Use vop_ebadf instead of *_ebadf. Add vop_defaultop for getpages on master vnode in MFS.
|
30496 |
16-Oct-1997 |
phk |
VFS clean up "hekto commit"
1. Add defaults for more VOPs VOP_LOCK vop_nolock VOP_ISLOCKED vop_noislocked VOP_UNLOCK vop_nounlock and remove direct reference in filesystems.
2. Rename the nfsv2 vnop tables to improve sorting order.
|
30492 |
16-Oct-1997 |
phk |
Another VFS cleanup "kilo commit"
1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS} intereface function, and now lives in the ufsmount structure.
2. Remove VOP_SEEK, it was unused.
3. Add mode default vops:
VOP_ADVLOCK vop_einval VOP_CLOSE vop_null VOP_FSYNC vop_null VOP_IOCTL vop_enotty VOP_MMAP vop_einval VOP_OPEN vop_null VOP_PATHCONF vop_einval VOP_READLINK vop_einval VOP_REALLOCBLKS vop_eopnotsupp
And remove identical functionality from filesystems
4. Add vop_stdpathconf, which returns the canonical stuff. Use it in the filesystems. (XXX: It's probably wrong that specfs and fifofs sets this vop, shouldn't it come from the "host" filesystem, for instance ufs or cd9660 ?)
5. Try to make system wide VOP functions have vop_* names.
6. Initialize the um_* vectors in LFS.
(Recompile your LKMS!!!)
|
30474 |
16-Oct-1997 |
phk |
VFS mega cleanup commit (x/N)
1. Add new file "sys/kern/vfs_default.c" where default actions for VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE, POLL, REVOKE and STRATEGY. Various stuff spread over the entire tree belongs here.
2. Change VOP_BLKATOFF to a normal function in cd9660.
3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These are private interface functions between UFS and the underlying storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now live in struct ufsmount instead.
4. Remove a kludge of VOP_ functions in all filesystems, that did nothing but obscure the simplicity and break the expandability. If a filesystem doesn't implement VOP_FOO, it shouldn't have an entry for it in its vnops table. The system will try to DTRT if it is not implemented. There are still some cruft left, but the bulk of it is done.
5. Fix another VCALL in vfs_cache.c (thanks Bruce!)
|
30434 |
15-Oct-1997 |
phk |
Hmm, realign the vnops into two columns.
|
30431 |
15-Oct-1997 |
phk |
Stylistic overhaul of vnops tables. 1. Remove comment stating the blatantly obvious. 2. Align in two columns. 3. Sort all but the default element alphabetically. 4. Remove XXX comments pointing out entries not needed.
|
29653 |
21-Sep-1997 |
dyson |
Change the M_NAMEI allocations to use the zone allocator. This change plus the previous changes to use the zone allocator decrease the useage of malloc by half. The Zone allocator will be upgradeable to be able to use per CPU-pools, and has more intelligent usage of SPLs. Additionally, it has reasonable stats gathering capabilities, while making most calls inline.
|
29362 |
14-Sep-1997 |
peter |
Convert select -> poll. Delete 'always succeed' select/poll handlers, replaced with generic call. Flag missing vnode op table entries.
|
29179 |
07-Sep-1997 |
bde |
Some staticized variables were still declared to be extern.
|
28270 |
16-Aug-1997 |
wollman |
Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to not malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.
|
28089 |
12-Aug-1997 |
sef |
Check permissions for fp regs as well as normal regs.
|
28086 |
12-Aug-1997 |
sef |
Fix procfs security hole -- check permissions on meaningful I/Os (namely, reading/writing of mem and regs). Also have to check for the requesting process being group KMEM -- this is a bit of a hack, but ps et al need it.
Reviewed by: davidg
|
27845 |
02-Aug-1997 |
bde |
Removed unused #includes.
|
26962 |
26-Jun-1997 |
alex |
Style fix my previous commit.
|
26769 |
21-Jun-1997 |
alex |
Block all write operations to /proc/1/* when securelevel > 0. The additional check in procfs_ctl.c could be backed out, but I'm leaving it in for good measure.
Reviewed by: Theo de Raadt <deraadt@OpenBSD.org>
|
25207 |
27-Apr-1997 |
alex |
Removed bogon from previous commit: doubly included sys/systm.h.
|
25200 |
27-Apr-1997 |
alex |
Prevent debugger attachment to init when securelevel > 0.
Noticed by: Brian Buchanan <brian@wasteland.calbbs.com>
|
25055 |
20-Apr-1997 |
dyson |
Fix both a problem with accessing backing objects, and also release the process map on nonexistant pages. PR: kern/3327 Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>
|
24666 |
06-Apr-1997 |
dyson |
Fix the gdb executable modify problem. Thanks to the detective work by Alan Cox <alc@cs.rice.edu>, and his description of the problem.
The bug was primarily in procfs_mem, but the mistake likely happened due to the lack of vm system support for the operation. I added better support for selective marking of page dirty flags so that vm_map_pageable(wiring) will not cause this problem again.
The code in procfs_mem is now less bogus (but maybe still a little so.)
|
24203 |
24-Mar-1997 |
bde |
Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include it when it is not used. In most cases, the reasons for including it went away when the special ioctl headers became self-sufficient.
|
23526 |
08-Mar-1997 |
bde |
Fixed missing initialisation of vp->v_type for types Pfile and Pmem in procfs_allocvp(). This fixes at least stat() of /proc/*/mem.
stat() of /proc/*/file already worked. I think procfs_allocvp() isn't actually called for type Pfile.
|
23077 |
24-Feb-1997 |
bde |
Fixed procfs's locking vops. They were missed in the Lite2 merge, partly because the #define's for them were moved to a different file. At least the null VOP_LOCK() no longer works, since vclean() expects VOP_LOCK( ..., LK_DRAIN | LK_INTERLOCK, ...) to clear the interlock. This probably only matters when simple_lock() is not null, i.e., when there are multiple CPUs or SIMPLELOCK_DEBUG is defined.
|
22975 |
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
22579 |
12-Feb-1997 |
mpp |
Add function prototypes for most of the new Lite2 functions. Also made a few of the miscfs routines static to be consistent. Some modules simply required some additional #includes to remove -Wall warnings.
|
22521 |
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
21754 |
16-Jan-1997 |
dyson |
Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.
|
21673 |
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
19261 |
30-Oct-1996 |
dyson |
Fix a potential deadlock from the previous commit.
|
19260 |
30-Oct-1996 |
dyson |
Fix the /proc/???/map file so that it is possible to read an arbitrarily large process map. Another commit will follow to fix a problem just found during this one... Sorry!!! :-(.
|
19141 |
24-Oct-1996 |
dyson |
Fix setting breakpoints in shared regions.
|
18020 |
03-Sep-1996 |
bde |
Eliminated nested include of <sys/unistd.h> in <sys/file.h> in the kernel. Include it directly in the few places where it is used.
Reduced some #includes of <sys/file.h> to #includes of <sys/fcntl.h> or nothing.
|
17974 |
31-Aug-1996 |
bde |
Fixed the easy cases of const poisoning in the kernel. Cosmetic.
|
17306 |
27-Jul-1996 |
dyson |
Modify slightly the output from the map file in /proc. Now the executable bit is shown.
|
17303 |
27-Jul-1996 |
dyson |
Under certain circumstances, reading the /proc/*/map file can crash the system. Nonexistant objects were not handled correctly.
|
16901 |
02-Jul-1996 |
dyson |
Implement locking for pfs nodes, when at the leaf. Concurrent access to information from a single process causes hangs. Specifically, this fixes problems (hangs) with concurrent ps commands, when the system is under heavy memory load. Reviewed by: davidg
|
16889 |
02-Jul-1996 |
dyson |
Fix a serious problem, with a window where an object lock is needed, but not there. The extent of the object lock is expanded to be over the range that it is needed. Additionally, clean up the code so that it conforms to better coding style.
|
16476 |
18-Jun-1996 |
dyson |
Add procfs_type.c to the repository.
|
16474 |
18-Jun-1996 |
dyson |
Clean-up the new VM map procfs code, and also add support for executable format file "etype". It contains a description of the binary type for a process.
|
16468 |
17-Jun-1996 |
dyson |
This file is the "meat" of the process address space capability. If you would like other things added, just ask!!! It might be pretty easy to add.
|
16467 |
17-Jun-1996 |
dyson |
Add a feature to procfs to allow display of the process address map with multiple entries as follows:
start address, end address, resident pages in range, private pages in range, RW/RO, COW or not, (vnode/device/swap/default).
|
16312 |
12-Jun-1996 |
dg |
Moved the fsnode MALLOC to before the call to getnewvnode() so that the process won't possibly block before filling in the fsnode pointer (v_data) which might be dereferenced during a sync since the vnode is put on the mnt_vnodelist by getnewvnode.
Pointed out by Matt Day <mday@artisoft.com>
|
16308 |
11-Jun-1996 |
dyson |
Properly lock the vm space when accessing the memory in a process. This fix could solve some "interesting" problems that could happen during process rundown.
|
14532 |
11-Mar-1996 |
hsu |
For Lite2: proc LIST changes. Reviewed by: davidg & bde
|
13838 |
02-Feb-1996 |
wosch |
add ruid and rgid to file 'status'
|
13627 |
25-Jan-1996 |
peter |
This time, really make the procfs work when reading stuff from the UPAGES.
This is a really ugly bandaid on the problem, but it works well enough for 'ps -u' to start working again. The problem was caused by the user address space shrinking by a little bit and the UPAGES being "cast off" to become a seperate entity rather than being at the top of the process's vmspace. That optimization was part of John's most recent VM speedups.
Now, rather than decoding the VM space, it merely ensures the pages are in core and accesses them the same way the ptrace(PT_READ_U..) code does, ie: off the p->p_addr pointer.
|
13608 |
24-Jan-1996 |
peter |
Major fixes for procfs..
Implement a "variable" directory structure. Files that do not make sense for the given process do not "appear" and cannot be opened. For example, "system" processes do not have "file", "regs" or "fpregs", because they do not have a user area.
"attempt" to fill in the user area of a given process when it is being accessed via /proc/pid/mem (the user struct is just after VM_MAXUSER_ADDRESS in the process address space.)
Dont do IO to the U area while it's swapped, hold it in place if possible.
Lock off access to the "ctl" file if it's done a setuid like the other pseudo-files in there.
|
13490 |
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do alot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
|
12904 |
17-Dec-1995 |
bde |
Fixed 1TB filesize changes. Some pindexes had bogus names and types but worked because vm_pindex_t is indistinuishable from vm_offset_t.
|
12767 |
11-Dec-1995 |
dyson |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
|
12662 |
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
12595 |
03-Dec-1995 |
bde |
Added prototypes.
Removed some unnecessary #includes.
|
12337 |
16-Nov-1995 |
bde |
Moved declarations for static functions to the correct place (not in a header).
Removed stupid comments.
|
12336 |
16-Nov-1995 |
bde |
Fixed the type of procfs_sync(). Trailing args were missing.
Fixed the type of procfs_fhtovp(). The args had little resemblance to the correct ones.
Added prototypes.
|
12158 |
09-Nov-1995 |
bde |
Introduced a type `vop_t' for vnode operation functions and used it 1138 times (:-() in casts and a few more times in declarations. This change is null for the i386.
The type has to be `typedef int vop_t(void *)' and not `typedef int vop_t()' because `gcc -Wstrict-prototypes' warns about the latter. Since vnode op functions are called with args of different (struct pointer) types, neither of these function types is any use for type checking of the arg, so it would be preferable not to use the complete function type, especially since using the complete type requires adding 1138 casts to avoid compiler warnings and another 40+ casts to reverse the function pointer conversions before calling the functions.
|
12143 |
07-Nov-1995 |
phk |
Make a lot of private stuff static. Should anybody out there wonder about this vendetta against global variables, it is basically to make it more visible what our interfaces in the kernel really are. I'm almost convinced we should have a #define PUBLIC /* public interface */ and use it in the #includes...
|
11707 |
23-Oct-1995 |
dyson |
Removal of unnecessary usage of PG_COPYONWRITE.
|
10531 |
02-Sep-1995 |
mpp |
Change procfs_lookup to not allow delete/rename operations to prevent panics when a user tries to remove/rename the contents of /proc/###/*.
Obtained from: 4.4BSD-lite2
|
10024 |
11-Aug-1995 |
dg |
Be careful not to dereference NULL credentials pointers when doing the getattr function.
|
9540 |
16-Jul-1995 |
bde |
Don't include <sys/tty.h> in drivers that aren't tty drivers or in general files that don't depend on the internals of <sys/tty.h>
|
9507 |
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has escentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure escentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non- standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. It's marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintain- ability. (As were most all of these changes)
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
9346 |
28-Jun-1995 |
dg |
Killed the "probably_never" ifdef'd code.
|
8876 |
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
8740 |
25-May-1995 |
dg |
Fixed panic that resulted from mmaping files in kernfs and procfs. A regular user could panic the machine with a simple "tail /proc/curproc/mem" command. The problem was twofold: both kernfs and procfs didn't fill in the mnt_stat statfs struct (which would later lead to an integer divide fault in the vnode pager), and kernfs bogusly paniced if a bmap was attempted.
Reviewed by: John Dyson
|
8456 |
11-May-1995 |
rgrimes |
Fix -Wformat warnings from LINT kernel.
|
7835 |
15-Apr-1995 |
dg |
For P_SUGID processes, we must also change ownership of the mem file to root so that group kmem can still get to it. *SIGH*
|
7833 |
15-Apr-1995 |
dg |
Retain group kmem readability for P_SUGID processes.
|
7832 |
15-Apr-1995 |
dg |
Made /proc/n/mem file group kmem and group readable. Needed to fix ps so that it doesn't need to be setuid root.
|
7095 |
16-Mar-1995 |
wollman |
Add four more filesystem flags:
VFCF_NETWORK (this FS goes over the net) VFCF_READONLY (read-write mounts do not make any sense) VFCF_SYNTHETIC (data in this FS is not real) VFCF_LOOPBACK (this FS aliases something else)
cd9660 is readonly; nullfs, umapfs, and union are loopback; NFS is netowkr; procfs, kernfs, and fdesc are synthetic.
|
7090 |
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
6569 |
20-Feb-1995 |
dg |
Make sure process isn't swapped when messing with it. Added missing newline to log() call.
|
6151 |
03-Feb-1995 |
dg |
Fixed bmap run-length brokeness. Use bmap run-length extension when doing clustered paging.
Submitted by: John Dyson
|
5403 |
05-Jan-1995 |
dg |
Initialize map start hint to vm_map_find()...not doing so will cause it to fail if the random thing on the stack happens to be too large.
Submitted by: David Jones <dej@qpoint.torfree.net>
|
5312 |
31-Dec-1994 |
ache |
Fix problem when attached process detached Submitted by: Gary Jennejohn
|
3687 |
18-Oct-1994 |
dg |
Fixed bug I just introduced that would have allowed a user to clobber his kernel stack.
|
3685 |
18-Oct-1994 |
dg |
Allow upages to be paged in/accessed.
Submitted by: John Dyson
|
3496 |
10-Oct-1994 |
phk |
Cosmetics. reduce the noise from gcc -Wall.
|
3396 |
06-Oct-1994 |
dg |
Use tsleep() rather than sleep so that 'ps' is more informative about the wait.
|
3054 |
24-Sep-1994 |
dg |
1) Added "." and ".." entries. 2) Fixed directory size to return something reasonable. 3) Disabled "file" until the code is completed. 4) Corrected directory link counts.
|
2946 |
21-Sep-1994 |
wollman |
Implemented loadable VFS modules, and made most existing filesystems loadable. (NFS is a notable exception.)
|
2807 |
15-Sep-1994 |
bde |
Supply prototypes for some functions that were implicitly declared and fix the resulting warnings.
|
2112 |
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
1817 |
02-Aug-1994 |
dg |
Added $Id$
|
1549 |
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
1541 |
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|