#
290362 |
|
04-Nov-2015 |
glebius |
o Fix regressions related to SA-15:25 upgrade of NTP. [1] o Fix kqueue write events never being fired for files greater than 2GB. [2] o Fix applications exiting due to segmentation violation on a correct memory address. [3]
PR: 204046 [1] PR: 204203 [1] Errata Notice: FreeBSD-EN-15:19.kqueue [2] Errata Notice: FreeBSD-EN-15:20.vm [3] Approved by: so
|
#
272461 |
|
02-Oct-2014 |
gjb |
Copy stable/10@r272459 to releng/10.1 as part of the 10.1-RELEASE process.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
272202 |
|
27-Sep-2014 |
kib |
MFC r272036: Avoid calling vm_map_pmap_enter() for the MADV_WILLNEED on the wired entry, the pages must be already mapped.
Approved by: re (gjb)
|
#
270920 |
|
01-Sep-2014 |
kib |
Fix a leak of wired pages when unwiring a PROT_NONE-mapped wired region. Rework the handling of unwire to do it in batches, both at the pmap and object level.
All commits below are by alc.
MFC r268327: Introduce pmap_unwire().
MFC r268591: Implement pmap_unwire() for powerpc.
MFC r268776: Implement pmap_unwire() for arm.
MFC r268806: pmap_unwire(9) man page.
MFC r269134: When unwiring a region of an address space, do not assume that the underlying physical pages are mapped by the pmap. This fixes a leak of the wired pages on the unwiring of the region mapped with no access allowed.
MFC r269339: In the implementation of the new function pmap_unwire(), the call to MOEA64_PVO_TO_PTE() must be performed before any changes are made to the PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic.
MFC r269365: Correct a long-standing problem in moea{,64}_pvo_enter() that was revealed by the combination of r268591 and r269134: When we attempt to add the wired attribute to an existing mapping, moea{,64}_pvo_enter() do nothing. (They only set the wired attribute on newly created mappings.)
MFC r269433: Handle wiring failures in vm_map_wire() with the new functions pmap_unwire() and vm_object_unwire(). Retire vm_fault_{un,}wire(), since they are no longer used.
MFC r269438: Rewrite a loop in vm_map_wire() so that gcc doesn't think that the variable "rv" is uninitialized.
MFC r269485: Retire pmap_change_wiring().
Reviewed by: alc
|
#
269072 |
|
24-Jul-2014 |
kib |
MFC r267213 (by alc): Add a page size field to struct vm_page.
Approved by: alc
|
#
267956 |
|
27-Jun-2014 |
kib |
MFC r267664: Assert that the new entry is inserted into the right location in the map entries list, and that it does not overlap with the previous and next entries.
|
#
267901 |
|
26-Jun-2014 |
kib |
MFC r267630: Add MAP_EXCL flag for mmap(2).
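A hedged usage sketch of the new flag (not from the commit itself; the hint address is illustrative): MAP_EXCL combines with MAP_FIXED to make the request fail instead of silently replacing an existing mapping.

    #include <sys/mman.h>
    #include <stdio.h>

    int
    main(void)
    {
            void *hint = (void *)0x200000000UL;     /* illustrative, page-aligned */

            /* MAP_EXCL is only meaningful together with MAP_FIXED. */
            void *p = mmap(hint, 4096, PROT_READ | PROT_WRITE,
                MAP_FIXED | MAP_EXCL | MAP_ANON, -1, 0);
            if (p == MAP_FAILED)
                    perror("mmap"); /* range already mapped, or other error */
            return (0);
    }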
|
#
267899 |
|
26-Jun-2014 |
kib |
MFC r267766: Use correct names for the flags.
|
#
267772 |
|
23-Jun-2014 |
kib |
MFC r267254: Make mmap(MAP_STACK) search for the available address space.
MFC r267497 (by alc): Use local variable instead of sgrowsiz.
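A minimal userland sketch of what this change enables (hedged; the size is illustrative): with a NULL hint, the kernel now searches for room for the grow-down stack mapping itself.

    #include <sys/mman.h>

    int
    main(void)
    {
            /* MAP_STACK implies an anonymous, grow-down mapping and
             * requires PROT_READ | PROT_WRITE. */
            void *stk = mmap(NULL, 1024 * 1024, PROT_READ | PROT_WRITE,
                MAP_STACK | MAP_ANON, -1, 0);
            return (stk == MAP_FAILED);
    }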
|
#
267059 |
|
04-Jun-2014 |
kib |
MFC r266780: Remove the assert which can be triggered by userspace.
|
#
266589 |
|
23-May-2014 |
alc |
MFC r265886, r265948 With the new-and-improved vm_fault_copy_entry() (r265843), we can always avoid soft page faults when adding write access to user wired entries in vm_map_protect(). Previously, we only avoided the soft page fault when the underlying pages were copy-on-write. In other words, we avoided the pages faults that might sleep on page allocation, but not the trivial page faults to update the physical map.
On a fork allow read-only wired pages to be copy-on-write shared between the parent and child processes. Previously, we copied these pages even though they are read only. However, the reason for copying them is historical and no longer exists. In recent times, vm_map_protect() has developed the ability to copy pages when write access is added to wired copy-on-write pages. So, in this case, copy-on-write sharing of wired pages is not to be feared. It is not going to lead to copy-on-write faults on wired memory.
|
#
266582 |
|
23-May-2014 |
kib |
MFC r266464: In execve(2), postpone the free of old vmspace until the threads are resumed and exited.
|
#
266315 |
|
17-May-2014 |
alc |
MFC r265850 About 9% of the pmap_protect() calls being performed by vm_map_copy_entry() are unnecessary. Eliminate the unnecessary calls.
|
#
266302 |
|
17-May-2014 |
kib |
MFC r265825: When printing the map with the ddb 'show procvm' command, do not dump page queues for the backing objects.
|
#
266299 |
|
17-May-2014 |
kib |
MFC r265824: Print the entry address in addition to the object.
|
#
263684 |
|
24-Mar-2014 |
kib |
MFC r263471: Initialize vm_map_entry member wiring_thread on the map entry creation.
|
#
260081 |
|
30-Dec-2013 |
kib |
MFC r259951: Do not coalesce stack entries. Pass the MAP_STACK_GROWS_DOWN and MAP_STACK_GROWS_UP flags to vm_map_insert() from vm_map_stack().
|
#
259299 |
|
13-Dec-2013 |
kib |
MFC r258367: Check for zero-length requests and act as if they are always successful, without performing any action on the address space.
|
#
259297 |
|
13-Dec-2013 |
kib |
MFC r258366: Add assertions to cover all places in the wiring and unwiring code where MAP_ENTRY_IN_TRANSITION is set or cleared.
|
#
256281 |
|
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
#
255793 |
|
22-Sep-2013 |
alc |
Both the vm_map and vmspace zones are defined as "no free". So, there is no point in defining a fini function for these zones.
Reviewed by: kib Approved by: re (glebius) Sponsored by: EMC / Isilon Storage Division
|
#
255732 |
|
20-Sep-2013 |
neel |
Merge the following changes from projects/bhyve_npt_pmap: - add fields to 'struct pmap' that are required to manage nested page tables. - add a parameter to 'vmspace_alloc()' that can be used to override the default pmap initialization routine 'pmap_pinit()'.
These changes are pushed ahead of the remaining changes in 'bhyve_npt_pmap' in anticipation of the upcoming KBI freeze for 10.0.
Reviewed by: kib@, alc@ Approved by: re (glebius)
|
#
255426 |
|
09-Sep-2013 |
jhb |
Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux.
To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE.
Reviewed by: alc Approved by: re (kib)
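A hedged usage sketch (not from the commit): request an anonymous mapping that must land in the low 2GB, e.g. for code that stores pointers in 32-bit fields.

    #include <sys/mman.h>
    #include <stdint.h>
    #include <assert.h>

    int
    main(void)
    {
            void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_32BIT, -1, 0);
            if (p != MAP_FAILED)
                    assert((uintptr_t)p < (1UL << 31));     /* below 2GB */
            return (0);
    }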
|
#
255028 |
|
29-Aug-2013 |
alc |
Significantly reduce the cost, i.e., run time, of calls to madvise(..., MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new pmap function, pmap_advise(), that operates on a range of virtual addresses within the specified pmap, allowing for a more efficient implementation of MADV_DONTNEED and MADV_FREE. Previously, the implementation of MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as pmap_clear_reference(). Intuitively, the problem with this implementation is that the pmap-level locks are acquired and released and the page table traversed repeatedly, once for each resident page in the range that was specified to madvise(2). A more subtle flaw with the previous implementation is that pmap_clear_reference() would clear the reference bit on all mappings to the specified page, not just the mapping in the range specified to madvise(2).
Since our malloc(3) makes heavy use of madvise(2), this change can have a measurable impact. For example, the system time for completing a parallel "buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.
Note: This change only contains pmap_advise() implementations for a subset of our supported architectures. I will commit implementations for the remaining architectures after further testing. For now, a stub function is sufficient because of the advisory nature of pmap_advise().
Discussed with: jeff, jhb, kib Tested by: pho (i386), marcel (ia64) Sponsored by: EMC / Isilon Storage Division
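For context, the userland call whose kernel path this speeds up looks like the following hedged sketch (buffer and size are illustrative); one madvise(2) call over a large range now becomes a single ranged pmap_advise() instead of a per-page loop.

    #include <sys/mman.h>

    int
    main(void)
    {
            size_t len = 64 * 4096;
            void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_ANON, -1, 0);
            if (buf == MAP_FAILED)
                    return (1);
            /* Tell the VM system these pages may be reclaimed cheaply. */
            return (madvise(buf, len, MADV_FREE));
    }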
|
#
254667 |
|
22-Aug-2013 |
kib |
Revert r254501. Instead, reuse the type stability of the struct pmap which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE zone. Initialize the pmap lock in the vmspace zone init function, and remove pmap lock initialization and destruction from pmap_pinit() and pmap_release().
Suggested and reviewed by: alc (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
254430 |
|
16-Aug-2013 |
jhb |
Add new mmap(2) flags to permit applications to request specific virtual address alignment of mappings. - MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n). Requests for n >= number of bits in a pointer or less than the size of a page fail with EINVAL. This matches the API provided by NetBSD. - MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used to optimize the chances of using large pages. By default it will align the mapping on a large page boundary (the system is free to choose any large page size to align to that seems best for the mapping request). However, if the object being mapped is already using large pages, then it will align the virtual mapping to match the existing large pages in the object instead. - Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment. MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE. - mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than explicitly using VMFS_SUPER_SPACE. All device objects are forced to use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively equivalent.
Reviewed by: alc MFC after: 1 month
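A hedged sketch of the new API (sizes are illustrative; 1 << 21 is the amd64 superpage size):

    #include <sys/mman.h>
    #include <stdint.h>

    int
    main(void)
    {
            /* Ask for a mapping aligned on a 2MB boundary; MAP_ALIGNED_SUPER
             * would instead let the kernel pick a large-page alignment. */
            void *p = mmap(NULL, 4 * 1024 * 1024, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_ALIGNED(21), -1, 0);
            if (p == MAP_FAILED)
                    return (1);
            return (((uintptr_t)p & ((1UL << 21) - 1)) != 0);
    }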
|
#
254025 |
|
07-Aug-2013 |
jeff |
Replace kernel virtual address space allocation with vmem. This provides transparent layering and better fragmentation.
- Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem.
Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division
|
#
253636 |
|
25-Jul-2013 |
kientzle |
Clear entire map structure including locks so that the locks don't accidentally appear to have been already initialized.
In particular, this fixes a consistent kernel crash on armv6 with: panic: lock "vm map (user)" 0xc09cc050 already initialized that appeared with r251709.
PR: arm/180820
|
#
253471 |
|
19-Jul-2013 |
jhb |
Be more aggressive in using superpages in all mappings of objects: - Add a new address space allocation method (VMFS_OPTIMAL_SPACE) for vm_map_find() that will try to alter the alignment of a mapping to match any existing superpage mappings of the object being mapped. If no suitable address range is found with the necessary alignment, vm_map_find() will fall back to using the simple first-fit strategy (VMFS_ANY_SPACE). - Change mmap() without MAP_FIXED, shmat(), and the GEM mapping ioctl to use VMFS_OPTIMAL_SPACE instead of VMFS_ANY_SPACE.
Reviewed by: alc (earlier version) MFC after: 2 weeks
|
#
253190 |
|
11-Jul-2013 |
kib |
The mlockall() or VM_MAP_WIRE_HOLESOK does not interact properly with the parallel creation of map entries, e.g. by mmap() or stack growing. It also breaks when another entry is wired in parallel.
The vm_map_wire() iterates over the map entries in the region and assumes that the map entries it finds were already marked as in transition, and that any entry marked as in transition was marked by the current invocation of vm_map_wire(). This is not true for new entries in the holes.
Add the thread owner of the MAP_ENTRY_IN_TRANSITION flag to struct vm_map_entry. In vm_map_wire() and vm_map_unwire(), only process the entries whose transition owner is the current thread.
Reported and tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
251901 |
|
18-Jun-2013 |
des |
Fix a bug that allowed a tracing process (e.g. gdb) to write to a memory-mapped file in the traced process's address space even if neither the traced process nor the tracing process had write access to that file.
Security: CVE-2013-2171 Security: FreeBSD-SA-13:06.mmap Approved by: so
|
#
250884 |
|
21-May-2013 |
attilio |
o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() work most of the time with read locks only.
Sponsored by: EMC / Isilon storage division Reviewed by: alc
|
#
249303 |
|
09-Apr-2013 |
kib |
Fix the assertions for the state of the object under the map entry with the MAP_ENTRY_VN_WRITECNT flag: - Move the assertion that verifies the state of the v_writecount and vnp.writecount, under the block where the object is locked. - Check that the object type is OBJT_VNODE before asserting.
Reported by: avg Reviewed by: alc MFC after: 1 week
|
#
248084 |
|
09-Mar-2013 |
attilio |
Switch the vm_object mutex to be a rwlock. This will enable further optimizations in the future, where the vm_object lock can be held in read mode most of the time the resident pool of cached pages is accessed for reading.
The change is mostly mechanical but a few notes are reported: * The KPI changes are as follows: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (which forced consumers to include sys/mutex.h directly to cater to its inline functions using VM_OBJECT_LOCK()) imposes that all vm/vm_pager.h consumers must now also include sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between the FreeBSD and Solaris versions must be avoided. For this purpose, zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
This commit heavily breaks the KPI. Third-party ports must be updated accordingly (off-hand, I can think of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
247360 |
|
26-Feb-2013 |
attilio |
Merge from vmc-playground branch: Replace the sub-optimal uma_zone_set_obj() primitive with the more modern uma_zone_reserve_kva(). The new primitive reserves beforehand the necessary KVA space to cater to the zone allocations and allocates pages with ALLOC_NOOBJ. More specifically: - uma_zone_reserve_kva() does not need an object to cater to the backend allocator. - uma_zone_reserve_kva() can cater to M_WAITOK requests, in order to serve zones which need to do uma_prealloc() too. - When possible, uma_zone_reserve_kva() directly uses the direct mapping via uma_small_alloc() rather than relying on the KVA / offset combination.
The removal of the object attribute allows 2 further changes: 1) _vm_object_allocate() becomes static within vm_object.c 2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by direct calls to mtx_init() as there is no need to export it anymore and the calls aren't either homogeneous anymore: there are now small differences between arguments passed to mtx_init().
Sponsored by: EMC / Isilon storage division Reviewed by: alc (who also offered almost all the comments) Tested by: pho, jhb, davide
|
#
245421 |
|
14-Jan-2013 |
zont |
- Get rid of unused function vmspace_wired_count().
Reviewed by: alc Approved by: kib (mentor) MFC after: 1 week
|
#
245255 |
|
10-Jan-2013 |
zont |
- Reduce kernel size by removing unnecessary pointer indirections.
GENERIC kernel size is reduced by 16 bytes and the RACCT kernel by 336 bytes.
Suggested by: alc Reviewed by: alc Approved by: kib (mentor) MFC after: 1 week
|
#
244384 |
|
18-Dec-2012 |
zont |
- Fix locked memory accounting for maps with MAP_WIREFUTURE flag. - Add sysctl vm.old_mlock which may turn such accounting off.
Reviewed by: avg, trasz Approved by: kib (mentor) MFC after: 1 week
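For reference, MAP_WIREFUTURE is normally set on a process map via mlockall(2) with MCL_FUTURE, as in this hedged sketch; setting the new sysctl vm.old_mlock to 1 restores the previous accounting behavior.

    #include <sys/mman.h>
    #include <stdio.h>

    int
    main(void)
    {
            /* MCL_FUTURE wires future mappings as they are created. */
            if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1)
                    perror("mlockall");
            return (0);
    }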
|
#
244043 |
|
08-Dec-2012 |
alc |
In the past four years, we've added two new vm object types. Each time, similar changes had to be made in various places throughout the machine-independent virtual memory layer to support the new vm object type. However, in most of these places, it's actually not the type of the vm object that matters to us but instead certain attributes of its pages. For example, OBJT_DEVICE, OBJT_MGTDEVICE, and OBJT_SG objects contain fictitious pages. In other words, in most of these places, we were testing the vm object's type to determine if it contained fictitious (or unmanaged) pages.
To both simplify the code in these places and make the addition of future vm object types easier, this change introduces two new vm object flags that describe attributes of the vm object's pages, specifically, whether they are fictitious or unmanaged.
Reviewed and tested by: kib
|
#
243529 |
|
25-Nov-2012 |
alc |
Make a few small changes to vm_map_pmap_enter():
Add detail to the comment describing this function. In particular, describe what MAP_PREFAULT_PARTIAL does.
Eliminate the abrupt change in behavior when the specified address range grows from MAX_INIT_PT pages to MAX_INIT_PT plus one pages. Instead of doing nothing, i.e., preloading no mappings whatsoever, map any resident pages that fall within the start of the specified address range, i.e., [addr, addr + ulmin(size, ptoa(MAX_INIT_PT))).
Long ago, the vm object's list of resident pages was not ordered, so this function had to choose between probing the global hash table of all resident pages and iterating over the vm object's unordered list of resident pages. Now, the list is ordered, so there is no reason for MAP_PREFAULT_PARTIAL to be concerned with the vm object's count of resident pages.
MFC after: 14 days
|
#
242903 |
|
11-Nov-2012 |
attilio |
Fix the DDB command "show map XXX": - Check that an argument is always available, otherwise the current map printed before recursing is garbage. - Spit out a message if an argument is not provided. - Remove the unread nlines variable. - Use an explicit recursive function, disassociated from the DB_SHOW_COMMAND() body, in order to make the prototype and recursion of the above mentioned function clear. The code is now much less obscure.
Submitted by: gianni
|
#
240069 |
|
03-Sep-2012 |
zont |
- After r240026, sgrowsiz should be used in a safer manner.
Approved by: kib (mentor) MFC after: 1 week
|
#
237623 |
|
27-Jun-2012 |
alc |
Add new pmap layer locks to the predefined lock order. Change the names of a few existing VM locks to follow a consistent naming scheme.
|
#
237334 |
|
20-Jun-2012 |
jhb |
Move the per-thread deferred user map entries list into a private list in vm_map_process_deferred() which is then iterated to release map entries. This avoids having a nested vm map unlock operation called from the loop body attempt to recurse into vm_map_process_deferred(). This can happen if the vm_map_remove() triggers the OOM killer.
Reviewed by: alc, kib MFC after: 1 week
|
#
236848 |
|
10-Jun-2012 |
kib |
Use the previous stack entry's protection and max protection to correctly propagate the stack execution permissions when the stack grows down.
First, curproc->p_sysent->sv_stackprot specifies the maximum allowed stack protection for the current ABI, so the new stack entry was typically always marked executable. Second, for a non-main stack MAP_STACK mapping, the PROT_ flags specified at mmap(2) call time should be used, not sv_stackprot.
MFC after: 1 week
|
#
235230 |
|
10-May-2012 |
alc |
Give vm_fault()'s sequential access optimization a makeover.
There are two aspects to the sequential access optimization: (1) read ahead of pages that are expected to be accessed in the near future and (2) unmap and cache behind of pages that are not expected to be accessed again. This revision changes both aspects.
The read ahead optimization is now more effective. It starts with the same initial read window as before, but arithmetically grows the window on sequential page faults. This can yield increased read bandwidth. For example, on one of my machines, a program using mmap() to read a file that is several times larger than the machine's physical memory takes about 17% less time to complete.
The unmap and cache behind optimization is now more selectively applied. The read ahead window must grow to its maximum size before unmap and cache behind is performed. This significantly reduces the number of times that pages are unmapped and cached only to be reactivated a short time later.
The unmap and cache behind optimization now clears each page's referenced flag. Previously, in the case of dirty pages, if the containing file was still mapped at the time that the page daemon examined the dirty pages, they would be reactivated.
From a stylistic standpoint, this revision also cleanly separates the implementation of the read ahead and unmap/cache behind optimizations.
Glanced at: kib MFC after: 2 weeks
|
#
233191 |
|
19-Mar-2012 |
jhb |
Fix madvise(MADV_WILLNEED) to properly handle individual mappings larger than 4GB. Specifically, the inlined version of 'ptoa' of the 'int' count of pages overflowed on 64-bit platforms. While here, change vm_object_madvise() to accept two vm_pindex_t parameters (start and end) rather than a (start, count) tuple to match other VM APIs as suggested by alc@.
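The arithmetic behind the bug, as a hedged standalone sketch (PAGE_SHIFT defined locally for illustration): shifting an 'int' page count left by PAGE_SHIFT overflows once the region size no longer fits in 32 bits, while widening the count first does not.

    #include <stdio.h>
    #define PAGE_SHIFT 12   /* 4K pages, illustrative */

    int
    main(void)
    {
            int count = 1 << 20;                    /* 4GB worth of 4K pages */
            /* int arithmetic overflows (undefined behavior, shown only
             * to illustrate the bug that was fixed): */
            long bad = count << PAGE_SHIFT;
            long good = (long)count << PAGE_SHIFT;  /* widened first: 4GB */
            printf("%ld vs %ld\n", bad, good);
            return (0);
    }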
|
#
233100 |
|
17-Mar-2012 |
kib |
In vm_object_page_clean(), do not clear the OBJ_MIGHTBEDIRTY object flag if the filesystem performed a short write and we are skipping the page due to this.
Propagate the write error from the pager back to the callers of vm_pageout_flush(). Report the failure to write a page from the requested range as the FALSE return value from vm_object_page_clean(), and propagate it back to msync(2) to return EIO to usermode.
While there, convert the clearobjflags variable in the vm_object_page_clean() and arguments of the helper functions to boolean.
PR: kern/165927 Reviewed by: alc MFC after: 2 weeks
|
#
232160 |
|
25-Feb-2012 |
alc |
Simplify vmspace_fork()'s control flow by copying immutable data before the vm map locks are acquired. Also, eliminate redundant initialization of the new vm map's timestamp.
Reviewed by: kib MFC after: 3 weeks
|
#
232071 |
|
23-Feb-2012 |
kib |
Account for writeable shared mappings backed by a file in the vnode v_writecount. Keep the amount of the virtual address space used by the mappings in the new vm_object un_pager.vnp.writemappings counter. The vnode v_writecount is incremented when writemappings gets a non-zero value, and decremented when writemappings returns to zero.
Writeable shared vnode-backed mappings are accounted for in vm_mmap(), and vm_map_insert() is instructed to set the MAP_ENTRY_VN_WRITECNT flag on the created map entry. During deferred map entry deallocation, vm_map_process_deferred() checks for MAP_ENTRY_VN_WRITECNT and decrements writemappings for the vm object.
Now, the writeable mount cannot be demoted to read-only while writeable shared mappings of the vnodes from the mount point exist. Also, execve(2) fails for such files with ETXTBUSY, as it should be.
Noted by: tegge Reviewed by: tegge (long time ago, early version), alc Tested by: pho MFC after: 3 weeks
|
#
231526 |
|
11-Feb-2012 |
kib |
Close a race due to dropping of the map lock between creating a map entry for a shared mapping and marking the entry for inheritance. Another thread might execute vmspace_fork() in between (e.g. by fork(2)), resulting in the mapping becoming private.
Noted and reviewed by: alc MFC after: 1 week
|
#
227788 |
|
21-Nov-2011 |
attilio |
Introduce the same mutex-wise fix in r227758 for sx locks.
The functions that offer file and line specifications are: - sx_assert_ - sx_downgrade_ - sx_slock_ - sx_slock_sig_ - sx_sunlock_ - sx_try_slock_ - sx_try_xlock_ - sx_try_upgrade_ - sx_unlock_ - sx_xlock_ - sx_xlock_sig_ - sx_xunlock_
Now vm_map locking is fully converted and can avoid knowing the specifics of locking procedures. Reviewed by: kib MFC after: 1 month
|
#
227758 |
|
20-Nov-2011 |
attilio |
Introduce macro stubs in the mutex implementation that will always be defined and will allow consumers willing to provide options, file, and line to locking requests not to worry about options redefining the interfaces. This is typically useful when there is a need to build another locking interface on top of the mutex one.
The introduced functions that consumers can use are: - mtx_lock_flags_ - mtx_unlock_flags_ - mtx_lock_spin_flags_ - mtx_unlock_spin_flags_ - mtx_assert_ - thread_lock_flags_
Spare notes: - Likely we can get rid of all the 'INVARIANTS' specification in the ppbus code by using the same macro as done in this patch (but this is left to the ppbus maintainer) - all the other locking interfaces may require a similar cleanup, where the most notable case is sx, which will allow a further cleanup of vm_map locking facilities - The patch should be fully compatible with older branches, thus an MFC is foreseen (in fact it uses all the underlying mechanisms already present).
Comments review by: eadler, Ben Kaduk Discussed with: kib, jhb MFC after: 1 month
|
#
223825 |
|
06-Jul-2011 |
trasz |
All the racct_*() calls need to happen with the proc locked. Fixing this won't happen before 9.0. This commit adds "#ifdef RACCT" around all the "PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order to avoid useless locking/unlocking in kernels built without "options RACCT".
|
#
223677 |
|
29-Jun-2011 |
alc |
Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages.
This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages.
Update all of the existing assertions on pmap_remove_all() to reflect this change.
Reviewed by: kib
|
#
220373 |
|
05-Apr-2011 |
trasz |
Add accounting for most of the memory-related resources.
Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
|
#
219819 |
|
21-Mar-2011 |
jeff |
- Merge changes to the base system to support OFED. These include a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.
|
#
218304 |
|
04-Feb-2011 |
alc |
Since the last parameter to vm_object_shadow() is a vm_size_t and not a vm_pindex_t, it makes no sense for its callers to perform atop(). Let vm_object_shadow() do that instead.
|
#
218070 |
|
29-Jan-2011 |
alc |
Reenable the call to vm_map_simplify_entry() from vm_map_insert() for non- MAP_STACK_* entries. (See r71983 and r74235.)
In some cases, performing this call to vm_map_simplify_entry() halves the number of vm map entries used by the Sun JDK.
|
#
216335 |
|
09-Dec-2010 |
mlaier |
Fix a long standing (from the original 4.4BSD lite sources) race between vmspace_fork and vm_map_wire that would lead to "vm_fault_copy_wired: page missing" panics. While faulting in pages for a map entry that is being wired down, mark the containing map as busy. In vmspace_fork wait until the map is unbusy, before we try to copy the entries.
Reviewed by: kib MFC after: 5 days Sponsored by: Isilon Systems, Inc.
|
#
216128 |
|
02-Dec-2010 |
trasz |
Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage.
Reviewed by: kib@ Tested by: pho@ Sponsored by: FreeBSD Foundation
|
#
215307 |
|
14-Nov-2010 |
kib |
Implement a (soft) stack guard page for auto-growing stack mappings. The unmapped page separates the tip of the stack and a possible adjacent segment, making some uses of stack overflow harder. The stack growing code refuses to expand the segment to the last page of the reserved region when sysctl security.bsd.stack_guard_page is set to 1. The default value for the sysctl and accompanying tunable is 0.
Please note that mmap(MAP_FIXED) can still place a mapping right up to the stack, making a continuous region.
Reviewed by: alc MFC after: 1 week
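The knob can be flipped from C as well as from sysctl(8); a hedged sketch (requires privilege):

    #include <sys/types.h>
    #include <sys/sysctl.h>

    int
    main(void)
    {
            int one = 1;
            /* Equivalent of "sysctl security.bsd.stack_guard_page=1". */
            return (sysctlbyname("security.bsd.stack_guard_page",
                NULL, NULL, &one, sizeof(one)));
    }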
|
#
214953 |
|
07-Nov-2010 |
alc |
In case the stack size reaches its limit and its growth must be restricted, ensure that grow_amount is a multiple of the page size. Otherwise, the kernel may crash in swap_reserve_by_uid() on HEAD and FreeBSD 8.x, and produce a core file with a missing stack on FreeBSD 7.x.
Diagnosed and reported by: jilles Reviewed by: kib MFC after: 1 week
|
#
214144 |
|
21-Oct-2010 |
jhb |
- Make 'vm_refcnt' volatile so that compilers won't be tempted to treat its value as a loop invariant. Currently this is a no-op because 'atomic_cmpset_int()' clobbers all memory on current architectures. - Use atomic_fetchadd_int() instead of an atomic_cmpset_int() loop to drop a reference in vmspace_free().
Reviewed by: alc MFC after: 1 month
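A hedged sketch of the two patterns contrasted here ('refcnt' stands in for vm_refcnt):

    #include <sys/types.h>
    #include <machine/atomic.h>

    static volatile u_int refcnt;

    static void
    release(void)
    {
            /* Old style: retry loop built on atomic_cmpset_int():
             *      do { old = refcnt; }
             *      while (!atomic_cmpset_int(&refcnt, old, old - 1));
             * New style: one atomic_fetchadd_int(), which returns the
             * previous value; 1 means this was the last reference. */
            if (atomic_fetchadd_int(&refcnt, -1) == 1) {
                    /* Last reference dropped: free the structure here. */
            }
    }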
|
#
213408 |
|
04-Oct-2010 |
alc |
If vm_map_find() is asked to allocate a superpage-aligned region of virtual addresses that is greater than a superpage in size but not a multiple of the superpage size, then vm_map_find() is not always expanding the kernel pmap to support the last few small pages being allocated. These failures are not commonplace, so this was first noticed by someone porting FreeBSD to a new architecture. Previously, we grew the kernel page table in vm_map_findspace() when we found the first available virtual address. This works most of the time because we always grow the kernel pmap or page table by an amount that is a multiple of the superpage size. Now, instead, we defer the call to pmap_growkernel() until we are committed to a range of virtual addresses in vm_map_insert(). In general, there is another reason to prefer calling pmap_growkernel() in vm_map_insert(). It makes it possible for someone to do the equivalent of an mmap(MAP_FIXED) on the kernel map.
Reported by: Svatopluk Kraus Reviewed by: kib@ MFC after: 3 weeks
|
#
212868 |
|
19-Sep-2010 |
alc |
Make refinements to r212824. In particular, don't make vm_map_unlock_nodefer() part of the synchronization interface for maps.
Add comments to vm_map_unlock_and_wait() and vm_map_wakeup() describing how they should be used. In particular, describe the deferred deallocations issue with vm_map_unlock_and_wait().
Redo the implementation of vm_map_unlock_and_wait() so that it passes along the caller's file and line information, just like the other map locking primitives.
Reviewed by: kib X-MFC after: r212824
|
#
212824 |
|
18-Sep-2010 |
kib |
Adopt the deferring of object deallocation for the deleted map entries on map unlock to the lock downgrade and later read unlock operation.
System map entries cannot be backed by OBJT_VNODE objects, no need to defer deallocation for them. Map entries from user maps do not require the owner map for deallocation, and can be accumulated in the thread-local list for freeing when a user map is unlocked.
Move the collection of entries for deferred reclamation into vm_map_delete(). Create helper vm_map_process_deferred(), that is called from locations where processing is feasible. Do not process deferred entries in vm_map_unlock_and_wait() since map_sleep_mtx is held.
Reviewed by: alc, rstone (previous versions) Tested by: pho MFC after: 2 weeks
|
#
209685 |
|
04-Jul-2010 |
kib |
Introduce a helper function vm_page_find_least(). Use it in several places that previously inlined the same logic.
Reviewed by: alc Tested by: pho MFC after: 1 week
|
#
208574 |
|
26-May-2010 |
alc |
Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock.
Assert that the page is managed in pmap_is_referenced().
On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock.
Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment.
Correct a spelling error in vm_page_dontneed().
Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable.
Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked.
Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty().
Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions.
Reviewed by: kib
|
#
207487 |
|
01-May-2010 |
alc |
Correct an error of omission in r206819. If VMFS_TLB_ALIGNED_SPACE is specified to vm_map_find(), then retry the vm_map_findspace() if vm_map_insert() fails because the aligned space is already partly used.
Reported by: Neel Natu
|
#
206819 |
|
18-Apr-2010 |
jmallett |
o) Add a VM find-space option, VMFS_TLB_ALIGNED_SPACE, which searches the address space for an address as aligned by the new pmap_align_tlb() function, which is for constraints imposed by the TLB. [1] o) Add a kmem_alloc_nofault_space() function, which acts like kmem_alloc_nofault() but allows the caller to specify which find-space option to use. [1] o) Use kmem_alloc_nofault_space() with VMFS_TLB_ALIGNED_SPACE to allocate the kernel stack address on MIPS. [1] o) Make pmap_align_tlb() on MIPS align addresses so that they do not start on an odd boundary within the TLB, so that they are suitable for insertion as wired entries and do not have to share a TLB entry with another mapping, assuming they are appropriately-sized. o) Eliminate md_realstack now that the kstack will be appropriately-aligned on MIPS. o) Increase the number of guard pages to 2 so that we retain the proper alignment of the kstack address.
Reviewed by: [1] alc X-MFC-after: Making sure alc has not come up with a better interface.
|
#
206142 |
|
03-Apr-2010 |
alc |
Make _vm_map_init() the one place where the vm map's pmap field is initialized.
Reviewed by: kib
|
#
206140 |
|
03-Apr-2010 |
alc |
Re-enable the call to pmap_release() by vmspace_dofree(). The accounting problem that is described in the comment has been addressed.
Submitted by: kib Tested by: pho (a few months ago) MFC after: 6 weeks
|
#
203175 |
|
29-Jan-2010 |
kib |
The MAP_ENTRY_NEEDS_COPY flag belongs to protoeflags; the cow variable uses a different namespace.
Reported by: Jonathan Anderson <jonathan.anderson cl cam ac uk> MFC after: 3 days
|
#
199819 |
|
26-Nov-2009 |
alc |
Replace VM_PROT_OVERRIDE_WRITE by VM_PROT_COPY. VM_PROT_OVERRIDE_WRITE has represented a write access that is allowed to override write protection. Until now, VM_PROT_OVERRIDE_WRITE has been used to write breakpoints into text pages. Text pages are not just write protected but they are also copy-on-write. VM_PROT_OVERRIDE_WRITE overrides the write protection on the text page and triggers the replication of the page so that the breakpoint will be written to a private copy. However, here is where things become confused. It is the debugger, not the process being debugged that requires write access to the copied page. Nonetheless, the copied page is being mapped into the process with write access enabled. In other words, once the debugger sets a breakpoint within a text page, the program can write to its private copy of that text page. Whereas prior to setting the breakpoint, a SIGSEGV would have occurred upon a write access. VM_PROT_COPY addresses this problem. The combination of VM_PROT_READ and VM_PROT_COPY forces the replication of a copy-on-write page even though the access is only for read. Moreover, the replicated page is only mapped into the process with read access, and not write access.
Reviewed by: kib MFC after: 4 weeks
|
#
199490 |
|
18-Nov-2009 |
alc |
Simplify both the invocation and the implementation of vm_fault() for wiring pages.
(Note: Claims made in the comments about the handling of breakpoints in wired pages have been false for roughly a decade. This and another bug involving breakpoints will be fixed in coming changes.)
Reviewed by: kib
|
#
198812 |
|
02-Nov-2009 |
alc |
Avoid pointless calls to pmap_protect().
Reviewed by: kib
|
#
198505 |
|
27-Oct-2009 |
kib |
When the protection of a wired read-only mapping is changed to read-write, install a new shadow object behind the map entry and copy the pages from the underlying objects to it. This makes the mprotect(2) call actually perform the requested operation instead of silently doing nothing and returning success, which caused SIGSEGV on later write access to the mapping.
Reuse vm_fault_copy_entry() to do the copying, modifying it to behave correctly when src_entry == dst_entry.
Reviewed by: alc MFC after: 3 weeks
|
#
197661 |
|
01-Oct-2009 |
kib |
Move the annotation for vm_map_startup() immediately before the function.
MFC after: 3 days
|
#
195840 |
|
24-Jul-2009 |
jhb |
Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
#
195635 |
|
12-Jul-2009 |
kib |
When VM_MAP_WIRE_HOLESOK is not specified and vm_map_wire(9) encounters a non-readable and non-executable map entry, the entry is skipped from wiring and the loop is aborted. But, since MAP_ENTRY_WIRE_SKIPPED was not set for the map entry, its wired_count is later erroneously decremented. vm_map_delete(9) for such a map entry got stuck in "vmmaps".
Properly set MAP_ENTRY_WIRE_SKIPPED when aborting the loop.
Reported by: John Marshall <john.marshall riverwillow com au> Approved by: re (kensmith)
|
#
195329 |
|
03-Jul-2009 |
kib |
When forking a vm space that has wired map entries, do not forget to charge the objects created by vm_fault_copy_entry. The object charge was set, but the reserve was not incremented.
Reported by: Greg Rivers <gcr+freebsd-current tharned org> Reviewed by: alc (previous version) Approved by: re (kensmith)
|
#
194766 |
|
23-Jun-2009 |
kib |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
The interface of vm_pager_allocate(9) is extended by adding a struct ucred *, which is used to charge the appropriate uid when the allocation is performed by the kernel, e.g. md(4).
Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced.
In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
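A hedged sketch of the new rlimit from userland (the 1GB cap is illustrative):

    #include <sys/types.h>
    #include <sys/resource.h>

    int
    main(void)
    {
            /* Cap the swap this process (and its uid) may reserve at 1GB. */
            struct rlimit rl = { .rlim_cur = 1UL << 30, .rlim_max = 1UL << 30 };
            return (setrlimit(RLIMIT_SWAP, &rl));
    }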
|
#
193842 |
|
09-Jun-2009 |
alc |
Eliminate an unnecessary restriction on the vm object type from vm_map_pmap_enter(). The immediate effect of this change is that automatic prefaulting by mmap() for small mappings is performed on POSIX shared memory objects just the same as it is on ordinary files.
|
#
193643 |
|
07-Jun-2009 |
alc |
Eliminate unnecessary obfuscation when testing a page's valid bits.
|
#
191256 |
|
18-Apr-2009 |
alc |
Allow valid pages to be mapped for read access when they have a non-zero busy count. Only mappings that allow write access should be prevented by a non-zero busy count.
(The prohibition on mapping pages for read access when they have a non- zero busy count originated in revision 1.202 of i386/i386/pmap.c when this code was a part of the pmap.)
Reviewed by: tegge
|
#
190886 |
|
10-Apr-2009 |
kib |
When vm_map_wire(9) is allowed to skip holes in the wired region, skip the mappings without any read or execute rights, in particular the PROT_NONE entries. This makes mlockall(2) work for process address spaces that have such mappings.
Since protection mode of the entry may change between setting MAP_ENTRY_IN_TRANSITION and final pass over the region that records the wire status of the entries, allocate new map entry flag MAP_ENTRY_WIRE_SKIPPED to mark the skipped PROT_NONE entries.
Reported and tested by: Hans Ottevanger <fbsdhackers beasties demon nl> Reviewed by: alc MFC after: 3 weeks
|
#
189015 |
|
24-Feb-2009 |
kib |
Revert the addition of the freelist argument for the vm_map_delete() function, done in r188334. Instead, collect the entries that shall be freed, in the deferred_freelist member of the map. Automatically purge the deferred freelist when map is unlocked.
Tested by: pho Reviewed by: alc
|
#
189014 |
|
24-Feb-2009 |
kib |
Add the assertion macros for the map locks. Use them in several map manipulation functions.
Tested by: pho Reviewed by: alc
|
#
189012 |
|
24-Feb-2009 |
kib |
Update the comment after the r188334.
Reviewed by: alc
|
#
188335 |
|
08-Feb-2009 |
kib |
Improve comments, correct English.
Submitted by: alc
|
#
188334 |
|
08-Feb-2009 |
kib |
Do not call vm_object_deallocate() from vm_map_delete(), because we hold the map lock there and might need the vnode lock for OBJT_VNODE objects. Postpone object deallocation until the caller of vm_map_delete() drops the map lock. Link the map entries to be freed into a freelist that is released by the new helper function vm_map_entry_free_freelist().
Reviewed by: tegge, alc Tested by: pho
|
#
188333 |
|
08-Feb-2009 |
kib |
In vm_map_sync(), do not call vm_object_sync() while holding map lock. Reference object, drop the map lock, and then call vm_object_sync(). The object sync might require vnode lock for OBJT_VNODE type objects.
Reviewed by: tegge Tested by: pho
|
#
188325 |
|
08-Feb-2009 |
kib |
Add comments to vm_map_simplify_entry() and vmspace_fork(), describing why several calls to vm_object_deallocate() with the map locked do not result in the acquisition of the vnode lock after the map lock.
Suggested and reviewed by: tegge
|
#
188323 |
|
08-Feb-2009 |
kib |
Lock the new map in vmspace_fork(). The newly allocated map should not be accessible outside vmspace_fork() yet, but locking it would satisfy the protocol of the vm_map_entry_link() and other functions called from vmspace_fork().
Use a trylock, which supposedly cannot fail, to silence the WITNESS warning of the nested acquisition of the sx lock with the same name.
Suggested and reviewed by: tegge
|
#
188320 |
|
08-Feb-2009 |
kib |
Do not leak the MAP_ENTRY_IN_TRANSITION flag when copying map entry on fork. Otherwise, copied entry cannot be removed in the child map.
Reviewed by: tegge MFC after: 2 weeks
|
#
186665 |
|
31-Dec-2008 |
alc |
Resurrect shared map locks allowing greater concurrency during some map operations, such as page faults.
An earlier version of this change was ...
Reviewed by: kib Tested by: pho MFC after: 6 weeks
|
#
186633 |
|
31-Dec-2008 |
alc |
Update or eliminate some stale comments.
|
#
186618 |
|
30-Dec-2008 |
alc |
Avoid an unnecessary memory dereference in vm_map_entry_splay().
|
#
186616 |
|
30-Dec-2008 |
alc |
Style change to vm_map_lookup(): Eliminate a macro of dubious value.
|
#
186609 |
|
30-Dec-2008 |
alc |
Move the implementation of the vm map's fast path on address lookup from vm_map_lookup{,_locked}() to vm_map_lookup_entry(). Having the fast path in vm_map_lookup{,_locked}() limits its benefits to page faults. Moving it to vm_map_lookup_entry() extends its benefits to other operations on the vm map.
|
#
179921 |
|
21-Jun-2008 |
alc |
KERNBASE is not necessarily an address within the kernel map, e.g., PowerPC/AIM. Consequently, it should not be used to determine the maximum number of kernel map entries. Instead, use VM_MIN_KERNEL_ADDRESS, which marks the start of the kernel map on all architectures.
Tested by: marcel@ (PowerPC/AIM)
|
#
178928 |
|
10-May-2008 |
alc |
Generalize vm_map_find(9)'s parameter "find_space". Specifically, add support for VMFS_ALIGNED_SPACE, which requests the allocation of an address range best suited to superpages. The old options TRUE and FALSE are mapped to VMFS_ANY_SPACE and VMFS_NO_SPACE, so that there is no immediate need to update all of vm_map_find(9)'s callers.
While I'm here, correct a misstatement about vm_map_find(9)'s return values in the man page.
|
#
178630 |
|
28-Apr-2008 |
alc |
vm_map_fixed(), unlike vm_map_find(), does not update "addr", so it can be passed by value.
|
#
177922 |
|
04-Apr-2008 |
alc |
Update a comment to vm_map_pmap_enter().
|
#
177091 |
|
12-Mar-2008 |
jeff |
Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
|
#
175079 |
|
04-Jan-2008 |
kib |
In the vm_map_stack(), check for the specified stack region wraparound.
Reported and tested by: Peter Holm Reviewed by: alc MFC after: 3 days
|
#
173429 |
|
07-Nov-2007 |
pjd |
Change unused 'user_wait' argument to 'timo' argument, which will be used to specify timeout for msleep(9).
Discussed with: alc Reviewed by: alc
|
#
173361 |
|
05-Nov-2007 |
kib |
Fix for the panic("vm_thread_new: kstack allocation failed") and a silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when kmem_alloc_nofault() failed to allocate address space. Both functions now return an error instead of panicking or dereferencing NULL.
As a consequence, vmspace_exec() and vmspace_unshare() return the errno int. A struct vmspace arg was added to vm_forkproc() to avoid dealing with a failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper).
In collaboration with: Peter Holm Reviewed by: jhb
|
#
172863 |
|
22-Oct-2007 |
alc |
Correct an error in vm_map_sync(), nee vm_map_clean(), that has existed since revision 1.1. Specifically, neither traversal of the vm map checks whether the end of the vm map has been reached. Consequently, the first traversal can wrap around and bogusly return an error.
This error has gone unnoticed for so long because no one had ever before tried msync(2)ing a region above the stack.
Reported by: peter MFC after: 1 week
|
#
172317 |
|
25-Sep-2007 |
alc |
Change the management of cached pages (PQ_CACHE) in two fundamental ways:
(1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock.
This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock.
Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq.
(2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed.
Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail.
Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@ Tested by: an earlier version by kris@ Approved by: re (kensmith)
|
#
171902 |
|
20-Aug-2007 |
kib |
Do not drop vm_map lock between doing vm_map_remove() and vm_map_insert(). For this, introduce vm_map_fixed() that does that for MAP_FIXED case.
Dropping the lock allowed for parallel thread to occupy the freed space.
Reported by: Tijl Coosemans <tijl ulyssis org> Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
#
170170 |
|
31-May-2007 |
attilio |
Revert the VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
#
170149 |
|
31-May-2007 |
attilio |
Add functions sx_xlock_sig() and sx_slock_sig(). These functions are intended to do the same as sx_xlock() and sx_slock(), with the difference that they perform an interruptible sleep, so that the sleep can be interrupted by external events. In order to support these new features, some code restructuring is needed, but the external API won't be affected at all.
Note: use a "void" cast for "int"-returning functions in order to prevent tools like Coverity from whining.
Requested by: rwatson Tested by: rwatson Reviewed by: jhb Approved by: jeff (mentor)
|
#
169849 |
|
22-May-2007 |
alc |
Eliminate the reactivation of cached pages in vm_fault_prefault() and vm_map_pmap_enter() unless the caller is madvise(MADV_WILLNEED). With the exception of calls to vm_map_pmap_enter() from madvise(MADV_WILLNEED), vm_fault_prefault() and vm_map_pmap_enter() are both used to create speculative mappings. Thus, always reactivating cached pages is a mistake. In principle, cached pages should only be reactivated by an actual access. Otherwise, the following misbehavior can occur. On a hard fault for a text page the clustering algorithm fetches not only the required page but also several of the adjacent pages. Now, suppose that one or more of the adjacent pages are never accessed. Ultimately, these unused pages become cached pages through the efforts of the page daemon. However, the next activation of the executable reactivates and maps these unused pages. Consequently, they are never replaced. In effect, they become pinned in memory.
|
#
169667 |
|
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
169048 |
|
26-Apr-2007 |
alc |
Remove some code from vmspace_fork() that became redundant after revision 1.334 modified _vm_map_init() to initialize the new vm map's flags to zero.
|
#
167880 |
|
25-Mar-2007 |
alc |
Two small changes to vm_map_pmap_enter():
1) Eliminate an unnecessary check for fictitious pages. Specifically, only device-backed objects contain fictitious pages and the object is not device-backed.
2) Change the types of "psize" and "tmpidx" to vm_pindex_t in order to prevent possible wrap around with extremely large maps and objects, respectively. Observed by: tegge (last summer)
|
#
166964 |
|
25-Feb-2007 |
alc |
Change the way that unmanaged pages are created. Specifically, immediately flag any page that is allocated to an OBJT_PHYS object as unmanaged in vm_page_alloc() rather than waiting for a later call to vm_page_unmanage(). This allows for the elimination of some uses of the page queues lock.
Change the type of the kernel and kmem objects from OBJT_DEFAULT to OBJT_PHYS. This allows us to take advantage of the above change to simplify the allocation of unmanaged pages in kmem_alloc() and kmem_malloc().
Remove vm_page_unmanage(). It is no longer used.
|
#
163594 |
|
21-Oct-2006 |
alc |
Eliminate unnecessary PG_BUSY tests. They originally served a purpose that is now handled by vm object locking.
|
#
160561 |
|
21-Jul-2006 |
alc |
Retire debug.mpsafevm. None of the architectures supported in CVS require it any longer.
|
#
159681 |
|
17-Jun-2006 |
alc |
Use ptoa(psize) instead of size to compute the end of the mapping in vm_map_pmap_enter().
|
#
159620 |
|
14-Jun-2006 |
alc |
Correct an error in the previous revision that could lead to a panic: Found mapped cache page. Specifically, if cnt.v_free_count dips below cnt.v_free_reserved after p_start has been set to a non-NULL value, then vm_map_pmap_enter() would break out of the loop and incorrectly call pmap_enter_object() for the remaining address range. To correct this error, this revision truncates the address range so that pmap_enter_object() will not map any cache pages.
In collaboration with: tegge@ Reported by: kris@
|
#
159303 |
|
05-Jun-2006 |
alc |
Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects.
Reviewed by: tegge@
|
#
159054 |
|
29-May-2006 |
tegge |
Close race between vmspace_exitfree() and exit1() and races between vmspace_exitfree() and vmspace_free() which could result in the same vmspace being freed twice.
Factor out part of exit1() into new function vmspace_exit(). Attach to vmspace0 to allow old vmspace to be freed earlier.
Add new function, vmspace_acquire_ref(), for obtaining a vmspace reference for a vmspace belonging to another process. Avoid changing vmspace refcount from 0 to 1 since that could also lead to the same vmspace being freed twice.
Change vmtotal() and swapout_procs() to use vmspace_acquire_ref().
Reviewed by: alc
|
#
156420 |
|
08-Mar-2006 |
imp |
Remove leading __ from __(inline|const|signed|volatile). They are obsolete. This should reduce diffs to NetBSD as well.
|
#
154889 |
|
27-Jan-2006 |
alc |
Use the new macros abstracting the page coloring/queues implementation. (There are no functional changes.)
|
#
153095 |
|
04-Dec-2005 |
alc |
Simplify vmspace_dofree().
|
#
153068 |
|
03-Dec-2005 |
alc |
Eliminate unneeded preallocation at initialization.
Reviewed by: tegge
|
#
152630 |
|
20-Nov-2005 |
alc |
Eliminate pmap_init2(). It's no longer used.
|
#
149768 |
|
03-Sep-2005 |
alc |
Pass a value of type vm_prot_t to pmap_enter_quick() so that it can determine whether the mapping should permit execute access.
|
#
148193 |
|
20-Jul-2005 |
alc |
Eliminate an incorrect (and unnecessary) cast.
|
#
145788 |
|
02-May-2005 |
alc |
Remove GIANT_REQUIRED from vmspace_exec().
Prodded by: jeff
|
#
140439 |
|
18-Jan-2005 |
alc |
Add checks to vm_map_findspace() to test for address wrap. The conditions where this could occur are very rare, but possible.
Submitted by: Mark W. Krentel MFC after: 2 weeks
|
#
139825 |
|
07-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
139241 |
|
23-Dec-2004 |
alc |
Modify pmap_enter_quick() so that it expects the page queues to be locked on entry and it assumes the responsibility for releasing the page queues lock if it must sleep.
Remove a bogus comment from pmap_enter_quick().
Using the first change, modify vm_map_pmap_enter() so that the page queues lock is acquired and released once, rather than each time that a page is mapped.
|
#
138897 |
|
15-Dec-2004 |
alc |
In the common case, pmap_enter_quick() completes without sleeping. In such cases, the busying of the page and the unlocking of the containing object by vm_map_pmap_enter() and vm_fault_prefault() is unnecessary overhead. To eliminate this overhead, this change modifies pmap_enter_quick() so that it expects the object to be locked on entry and it assumes the responsibility for busying the page and unlocking the object if it must sleep. Note: alpha, amd64, i386 and ia64 are the only implementations optimized by this change; arm, powerpc, and sparc64 still conservatively busy the page and unlock the object within every pmap_enter_quick() call.
Additionally, this change is the first case where we synchronize access to the page's PG_BUSY flag and busy field using the containing object's lock rather than the global page queues lock. (Modifications to the page's PG_BUSY flag and busy field have asserted both locks for several weeks, enabling an incremental transition.)
|
#
134675 |
|
03-Sep-2004 |
alc |
Push Giant deep into vm_forkproc(), acquiring it only if the process has mapped System V shared memory segments (see shmfork_myhook()) or requires the allocation of an ldt (see vm_fault_wire()).
|
#
133807 |
|
16-Aug-2004 |
alc |
- Introduce and use a new tunable "debug.mpsafevm". At present, setting "debug.mpsafevm" results in (almost) Giant-free execution of zero-fill page faults. (Giant is held only briefly, just long enough to determine if there is a vnode backing the faulting address.)
Also, condition the acquisition and release of Giant around calls to pmap_remove() on "debug.mpsafevm".
The effect on performance is significant. On my dual Opteron, I see a 3.6% reduction in "buildworld" time.
- Use atomic operations to update several counters in vm_fault().
|
#
133796 |
|
16-Aug-2004 |
green |
Rather than bringing back all of the changes to make VM map deletion wait for system wires to disappear, do so (much more trivially) by instead only checking for system wires of user maps and not kernel maps.
Alternative by: tor Reviewed by: alc
|
#
133726 |
|
14-Aug-2004 |
alc |
Remove spl calls.
|
#
133636 |
|
13-Aug-2004 |
alc |
Replace the linear search in vm_map_findspace() with an O(log n) algorithm built into the map entry splay tree. This replaces the first_free hint in struct vm_map with two fields in vm_map_entry: adj_free, the amount of free space following a map entry, and max_free, the maximum amount of free space in the entry's subtree. These fields make it possible to find a first-fit free region of a given size in one pass down the tree, so O(log n) amortized using splay trees.
This significantly reduces the overhead in vm_map_findspace() for applications that mmap() many hundreds or thousands of regions, and has a negligible slowdown (0.1%) on buildworld. See, for example, the discussion of a micro-benchmark titled "Some mmap observations compared to Linux 2.6/OpenBSD" on -hackers in late October 2003.
OpenBSD adopted this approach in March 2002, and NetBSD added it in November 2003, both with Red-Black trees.
Submitted by: Mark W. Krentel
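A minimal sketch of the one-pass first-fit search these two fields enable, using a hypothetical simplified node type rather than the real vm_map_entry (names and layout are illustrative only):

    #include <stddef.h>

    struct gap_node {
        size_t adj_free;                /* free space following this entry */
        size_t max_free;                /* max free space in this subtree */
        struct gap_node *left, *right;
    };

    /* Re-establish the max_free invariant after an update or rotation. */
    static void
    gap_update(struct gap_node *n)
    {
        n->max_free = n->adj_free;
        if (n->left != NULL && n->left->max_free > n->max_free)
            n->max_free = n->left->max_free;
        if (n->right != NULL && n->right->max_free > n->max_free)
            n->max_free = n->right->max_free;
    }

    /* One pass down the tree: lowest-addressed gap of at least "length". */
    static struct gap_node *
    gap_find(struct gap_node *n, size_t length)
    {
        while (n != NULL) {
            if (n->left != NULL && n->left->max_free >= length)
                n = n->left;            /* a lower-addressed gap fits */
            else if (n->adj_free >= length)
                return (n);             /* the gap after this entry fits */
            else
                n = n->right;           /* only higher addresses remain */
        }
        return (NULL);                  /* no gap large enough */
    }

Because max_free summarizes an entire subtree, the search never backtracks, which is where the O(log n) amortized bound comes from.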
|
#
133598 |
|
12-Aug-2004 |
tegge |
The vm map lock is needed in vm_fault() after the page has been found, to avoid later changes before pmap_enter() and vm_fault_prefault() has completed.
Simplify deadlock avoidance by not blocking on vm map relookup.
In collaboration with: alc
|
#
133587 |
|
12-Aug-2004 |
green |
Re-delete the comment from r1.352.
|
#
133435 |
|
10-Aug-2004 |
green |
Back out all behavioral changes.
|
#
133401 |
|
09-Aug-2004 |
green |
Revamp VM map wiring.
* Allow no-fault wiring/unwiring to succeed for consistency; however, the wired count remains at zero, so it's a special case.
* Fix issues inside vm_map_wire() and vm_map_unwire() where the exact state of user wiring (one or zero) and system wiring (zero or more) could be confused; for example, system unwiring could succeed in removing a user wire, instead of being an error.
* Require all mappings to be unwired before they are deleted. When VM space is still wired upon deletion, it will be waited upon for the following unwire. This makes vslock(9) work rather than allowing kernel-locked memory to be deleted out from underneath of its consumer as it would before.
|
#
133395 |
|
09-Aug-2004 |
alc |
Remove a stale comment from vm_map_lookup() that pertains to share maps. (The last vestiges of the share map code were removed in revisions 1.153 and 1.159.)
|
#
133143 |
|
04-Aug-2004 |
alc |
- Push down the acquisition and release of Giant into pmap_enter_quick() on those architectures without pmap locking. - Eliminate the acquisition and release of Giant in vm_map_pmap_enter().
|
#
132987 |
|
01-Aug-2004 |
green |
* Add a "how" argument to uma_zone constructors and initialization functions so that they know whether the allocation is supposed to be able to sleep or not. * Allow uma_zone constructors and initialation functions to return either success or error. Almost all of the ones in the tree currently return success unconditionally, but mbuf is a notable exception: the packet zone constructor wants to be able to fail if it cannot suballocate an mbuf cluster, and the mbuf allocators want to be able to fail in general in a MAC kernel if the MAC mbuf initializer fails. This fixes the panics people are seeing when they run out of memory for mbuf clusters. * Allow debug.nosleepwithlocks on WITNESS to be disabled, without changing the default.
Both bmilekic and jeff have reviewed the changes made to make failable zone allocations work.
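A sketch of what a failable constructor looks like under the new contract; the zone, type, and field names here are hypothetical stand-ins, not the actual packet-zone code:

    /* The ctor now receives "how" (M_WAITOK/M_NOWAIT) and returns
     * 0 on success or an errno value, instead of being void. */
    static int
    packet_ctor(void *mem, int size, void *arg, int how)
    {
        struct packet *p = mem;         /* hypothetical type */

        p->cluster = uma_zalloc(cluster_zone, how);     /* may sleep only
                                                           if M_WAITOK */
        if (p->cluster == NULL)
            return (ENOMEM);            /* the allocation can fail cleanly
                                           instead of panicking later */
        return (0);
    }

uma_zalloc() then propagates the constructor's failure to its caller by returning NULL.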
|
#
132899 |
|
30-Jul-2004 |
alc |
- Push down the acquisition and release of Giant into pmap_protect() on those architectures without pmap locking. - Eliminate the acquisition and release of Giant from vm_map_protect().
(Translation: mprotect(2) runs to completion without touching Giant on alpha, amd64, i386 and ia64.)
|
#
132880 |
|
30-Jul-2004 |
mux |
Get rid of another lockmgr(9) consumer by using sx locks for the user maps. We always acquire the sx lock exclusively here, but we can't use a mutex because we want to be able to sleep while holding the lock. This is completely equivalent to what we were doing with the lockmgr(9) locks before.
Approved by: alc
|
#
132684 |
|
27-Jul-2004 |
alc |
- Use atomic ops for updating the vmspace's refcnt and exitingcnt.
- Push down Giant into shmexit(). (Giant is acquired only if the vmspace contains shm segments.)
- Eliminate the acquisition of Giant from proc_rwmem().
- Reduce the scope of Giant in exit1(), uncovering the destruction of the address space.
|
#
132627 |
|
25-Jul-2004 |
alc |
Make the code and comments for vm_object_coalesce() consistent.
|
#
132593 |
|
24-Jul-2004 |
alc |
Simplify vmspace initialization. The bcopy() of fields from the old vmspace to the new vmspace in vmspace_exec() is mostly wasted effort. With one exception, vm_swrss, the copied fields are immediately overwritten. Instead, initialize these fields to zero in vmspace_alloc(), eliminating a bcopy() from vmspace_exec() and a bzero() from vmspace_fork().
|
#
132483 |
|
21-Jul-2004 |
peter |
Semi-gratuitous change. Move two refcount operations to their own lines rather than be buried inside an if (expression). And now that the if expression is the same in both exit paths, use the same ordering.
|
#
132475 |
|
20-Jul-2004 |
peter |
Move the initialization and teardown of pmaps to the vmspace zone's init and fini handlers. Our vm system removes all userland mappings at exit prior to calling pmap_release. It just so happens that we might as well reuse the pmap for the next process since the userland slate has already been wiped clean.
However, there is a functional benefit to this as well. For platforms that share userland and kernel context in the same pmap, it means that the kernel portion of a pmap remains valid after the vmspace has been freed (process exit) and while it is in uma's cache. This is significant for i386 SMP systems with kernel context borrowing because it avoids a LOT of IPIs from the pmap_lazyfix() cleanup in the usual case.
Tested on: amd64, i386, sparc64, alpha Glanced at by: alc
|
#
132220 |
|
15-Jul-2004 |
alc |
Push down the acquisition and release of the page queues lock into pmap_protect() and pmap_remove(). In general, they require the lock in order to modify a page's pv list or flags. In some cases, however, pmap_protect() can avoid acquiring the lock.
|
#
131252 |
|
28-Jun-2004 |
gallatin |
Use MIN() macro rather than ulmin() inline, and fix stray tab that snuck in with my last commit.
Submitted by: green
|
#
131251 |
|
28-Jun-2004 |
gallatin |
Fix alpha - the use of min() on longs was losing the high bits and returning wrong answers, leading to strange values in vm2->vm_{s,t,d}size.
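The failure mode is easy to demonstrate in isolation: the old min() took and returned u_int, so a 64-bit long (as on alpha) was silently truncated before the comparison. A hedged userland illustration:

    #include <stdio.h>

    static unsigned int
    umin(unsigned int a, unsigned int b)    /* analogue of the old min() */
    {
        return (a < b ? a : b);
    }

    #define LMIN(a, b) ((a) < (b) ? (a) : (b))  /* full width, like MIN() */

    int
    main(void)
    {
        long a = 0x100000001L;          /* 2^32 + 1 */
        long b = 0x200000000L;          /* 2^33 */

        /* The arguments are truncated to 32 bits, so this computes
         * umin(1, 0) == 0: the high bits are lost. */
        printf("umin: %ld\n", (long)umin(a, b));
        /* Compared at full width: prints 4294967297, the right answer. */
        printf("LMIN: %ld\n", LMIN(a, b));
        return (0);
    }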
|
#
131073 |
|
24-Jun-2004 |
green |
Correct the tracking of various bits of the process's vmspace and vm_map when not propagated on fork (due to minherit(2)). Consistency checks otherwise fail when the vm_map is freed and it appears to have not been emptied completely, causing an INVARIANTS panic in vm_map_zdtor().
PR: kern/68017 Submitted by: Mark W. Krentel <krentel@dreamscape.com> Reviewed by: alc
|
#
129728 |
|
25-May-2004 |
des |
Back out previous commit; it went to the wrong file.
|
#
129725 |
|
25-May-2004 |
des |
MFS: rev 1.187.2.27 through 1.187.2.29, fix MS_INVALIDATE semantics but provide a sysctl knob for reverting to old ones.
|
#
129701 |
|
25-May-2004 |
alc |
Correct two error cases in vm_map_unwire():
1. Contrary to the Single Unix Specification our implementation of munlock(2) when performed on an unwired virtual address range has returned an error. Correct this. Note, however, that the behavior of "system" unwiring is unchanged, only "user" unwiring is changed. If "system" unwiring is performed on an unwired virtual address range, an error is still returned.
2. Performing an errant "system" unwiring on a virtual address range that was "user" (i.e., mlock(2)) but not "system" wired would incorrectly undo the "user" wiring instead of returning an error. Correct this.
Discussed with: green@ Reviewed by: tegge@
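From userland, the corrected behavior is simply that a munlock(2) of a never-locked range now succeeds (sketch; assumes an ordinary malloc'ed buffer):

    #include <sys/mman.h>
    #include <err.h>
    #include <stdlib.h>

    int
    main(void)
    {
        size_t len = 4096;
        void *p = malloc(len);

        if (p == NULL)
            err(1, "malloc");
        /* The range was never mlock(2)ed; per the Single Unix
         * Specification this munlock(2) must nevertheless succeed.
         * Before this change it returned an error here. */
        if (munlock(p, len) == -1)
            err(1, "munlock");
        return (0);
    }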
|
#
129571 |
|
22-May-2004 |
alc |
To date, unwiring a fictitious page has produced a panic. The reason being that PHYS_TO_VM_PAGE() returns the wrong vm_page for fictitious pages but unwiring uses PHYS_TO_VM_PAGE(). The resulting panic reported an unexpected wired count. Rather than attempting to fix PHYS_TO_VM_PAGE(), this fix takes advantage of the properties of fictitious pages. Specifically, fictitious pages will never be completely unwired. Therefore, we can keep a fictitious page's wired count forever set to one and thereby avoid the use of PHYS_TO_VM_PAGE() when we know that we're working with a fictitious page, just not which one.
In collaboration with: green@, tegge@ PR: kern/29915
|
#
129018 |
|
06-May-2004 |
green |
Properly remove MAP_FUTUREWIRE when a vm_map_entry gets torn down. Previously, mlockall(2) usage would leak MAP_FUTUREWIRE of the process's vmspace::vm_map and subsequent processes would wire all of their memory. Coupled with a wired-page leak in vm_fault_unwire(), this would run the system out of free pages and cause programs to randomly SIGBUS when faulting in new pages.
(Note that this is not the fix for the latter part; pages are still leaked when a wired area is unmapped in some cases.)
Reviewed by: alc PR: kern/62930
|
#
128596 |
|
24-Apr-2004 |
alc |
In cases where a file was resident in memory mmap(..., PROT_NONE, ...) would actually map the file with read access enabled. According to http://www.opengroup.org/onlinepubs/007904975/functions/mmap.html this is an error. Similarly, an madvise(..., MADV_WILLNEED) would enable read access on a virtual address range that was PROT_NONE.
The solution implemented herein is (1) to pass a vm_prot_t to vm_map_pmap_enter() describing the allowed access and (2) to make vm_map_pmap_enter() responsible for understanding the limitations of pmap_enter_quick().
Submitted by: "Mark W. Krentel" <krentel@dreamscape.com> PR: kern/64573
|
#
127961 |
|
06-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
Approved by: core
|
#
127327 |
|
23-Mar-2004 |
tjr |
Do not copy vm_exitingcnt to the new vmspace in vmspace_exec(). Copying it led to impossibly high values in the new vmspace, causing it to never drop to 0 and be freed.
|
#
126728 |
|
07-Mar-2004 |
alc |
Retire pmap_pinit2(). Alpha was the last platform that used it. However, ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps, there has been no reason for having this function on Alpha. Briefly, when pmap_growkernel() relied upon the list of all processes to find and update the various pmaps to reflect a growth in the kernel's valid address space, pmap_pinit2() served to avoid a race between pmap initialization and pmap_growkernel(). Specifically, pmap_pinit2() was responsible for initializing the kernel portions of the pmap and pmap_pinit2() was called after the process structure contained a pointer to the new pmap for use by pmap_growkernel(). Thus, an update to the kernel's address space might be applied to the new pmap unnecessarily, but an update would never be lost.
|
#
125748 |
|
12-Feb-2004 |
alc |
Further reduce the use of Giant in vm_map_delete(): Perform pmap_remove() on system maps, besides the kmem_map, without Giant.
In collaboration with: tegge
|
#
125470 |
|
05-Feb-2004 |
alc |
- Locking for the per-process resource limits structure has eliminated the need for Giant in vm_map_growstack(). - Use the proc * that is passed to vm_map_growstack() rather than curthread->td_proc.
|
#
125454 |
|
04-Feb-2004 |
jhb |
Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it.
- Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.
Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
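A sketch of the consumer pattern this API encourages, assuming the lim_rlimit() shape described above: snapshot the limit under the proc lock, then use the private copy with no further locking (the surrounding code is illustrative):

    struct rlimit rlim;

    PROC_LOCK(p);
    lim_rlimit(p, RLIMIT_DATA, &rlim);  /* copy out under the proc lock */
    PROC_UNLOCK(p);
    /* rlim is a private snapshot; because plimit is copy-on-write,
     * the values cannot change underneath us. */
    if (new_size > rlim.rlim_cur)
        return (ENOMEM);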
|
#
125362 |
|
02-Feb-2004 |
jhb |
Drop the reference count on the old vmspace after fully switching the current thread to the new vmspace.
Suggested by: dillon
|
#
124008 |
|
30-Dec-2003 |
alc |
- Modify vm_object_split() to expect a locked vm object on entry and return on a locked vm object on exit. Remove GIANT_REQUIRED. - Eliminate some unnecessary local variables from vm_object_split().
|
#
123878 |
|
26-Dec-2003 |
alc |
Minor correction to revision 1.258: Use the proc pointer that is passed to vm_map_growstack() in the RLIMIT_VMEM check rather than curthread.
|
#
122902 |
|
19-Nov-2003 |
alc |
- Avoid a lock-order reversal between Giant and a system map mutex that occurs when kmem_malloc() fails to allocate a sufficient number of vm pages. Specifically, we avoid the lock-order reversal by not grabbing Giant around pmap_remove() if the map is the kmem_map.
Approved by: re (jhb) Reported by: Eugene <eugene3@web.de>
|
#
122646 |
|
14-Nov-2003 |
alc |
Changes to msync(2):
- Return EBUSY if the region was wired by mlock(2) and MS_INVALIDATE is specified to msync(2). This is required by the Open Group Base Specifications Issue 6.
- vm_map_sync() doesn't return KERN_FAILURE. Thus, msync(2) can't possibly return EIO.
- The second major loop in vm_map_sync() handles sub maps. Thus, failing on sub maps in the first major loop isn't necessary.
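A userland sketch of the newly specified failure mode:

    #include <sys/types.h>
    #include <sys/mman.h>
    #include <errno.h>

    /* Returns 1 if invalidation was refused because the range is wired. */
    static int
    try_invalidate(void *addr, size_t len)
    {
        if (msync(addr, len, MS_INVALIDATE) == 0)
            return (0);
        /* EBUSY: the region was wired by mlock(2), per the Open
         * Group Base Specifications Issue 6. */
        return (errno == EBUSY);
    }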
|
#
122384 |
|
09-Nov-2003 |
alc |
- The Open Group Base Specifications Issue 6 specifies that an munmap(2) must return EINVAL if size is zero. Submitted by: tegge
- In order to avoid a race condition in multithreaded applications, the check and removal operations by munmap(2) must be in the same critical section. To accommodate this, vm_map_check_protection() is modified to require its caller to obtain at least a read lock on the map.
|
#
122367 |
|
09-Nov-2003 |
alc |
- Remove Giant from msync(2). Giant is still acquired by the lower layers if we drop into the pmap or vnode layers. - Migrate the handling of zero-length msync(2)s into vm_map_sync() so that multithread applications can't change the map between implementing the zero-length hack in msync(2) and reacquiring the map lock in vm_map_sync().
Reviewed by: tegge
|
#
122349 |
|
09-Nov-2003 |
alc |
- Rename vm_map_clean() to vm_map_sync(). This better reflects the fact that msync(2) is its only caller. - Migrate the parts of the old vm_map_clean() that examined the internals of a vm object to a new function vm_object_sync() that is implemented in vm_object.c. At the same, introduce the necessary vm object locking so that vm_map_sync() and vm_object_sync() can be called without Giant.
Reviewed by: tegge
|
#
122095 |
|
05-Nov-2003 |
alc |
- Move the implementation of OBJ_ONEMAPPING from vm_map_delete() to vm_map_entry_delete() so that all of the vm object manipulation is performed in one place.
|
#
122034 |
|
04-Nov-2003 |
marcel |
Update avail_ssize for rstacks after growing them.
|
#
121962 |
|
03-Nov-2003 |
des |
Whitespace cleanup.
|
#
121919 |
|
02-Nov-2003 |
alc |
- Increase the scope of the source object lock in vm_map_copy_entry().
|
#
121907 |
|
02-Nov-2003 |
alc |
- Introduce and use vm_object_reference_locked(). Unlike vm_object_reference(), this function must not be used to reanimate dead vm objects. This restriction simplifies locking.
Reviewed by: tegge
|
#
121786 |
|
31-Oct-2003 |
marcel |
Fix two bugs introduced with the rstack functionality and specific to the rstack functionality:
1. Fix a KASSERT that tests for the address to be above the upward growable stack. Typically for rstack, the faulting address can be identical to the record end of the upward growable entry, and very likely is on ia64. The KASSERT tested for greater than, not greater than or equal, so whenever the register stack had to be grown the assertion fired.
2. When we grow the upward growable stack entry and adjust the underlying object, don't forget to adjust the size of the VM map. Not doing so would trigger an assert in vm_map_zdtor().
Pointy hat: marcel (for not testing with INVARIANTS).
|
#
121221 |
|
18-Oct-2003 |
alc |
Corrections to revision 1.305:
- Specifying VM_MAP_WIRE_HOLESOK should not assume that the start address is the beginning of the map. Instead, move to the first entry after the start address.
- The implementation of VM_MAP_WIRE_HOLESOK was incomplete. This caused the failure of mlockall(2) in some circumstances.
|
#
120831 |
|
05-Oct-2003 |
bms |
Move pmap_resident_count() from the MD pmap.h to the MI pmap.h. Add a definition of pmap_wired_count(). Add a definition of vmspace_wired_count().
Reviewed by: truckman Discussed with: peter
|
#
120531 |
|
27-Sep-2003 |
marcel |
Part 2 of implementing rstacks: add the ability to create rstacks and use the ability on ia64 to map the register stack. The orientation of the stack (i.e. its grow direction) is passed to vm_map_stack() in the overloaded cow argument. Since the grow direction is represented by bits, it is possible and allowed to create bi-directional stacks. This is not an advertised feature, more of a side-effect.
Fix a bug in vm_map_growstack() that's specific to rstacks and which we could only find by having the ability to create rstacks: when the mapped stack ends at the faulting address, we have not actually mapped the faulting address. We need to include or cover the faulting address.
Note that at this time mmap(2) has not been extended to allow the creation of rstacks by processes. If such a need arises, this can be done.
Tested on: alpha, i386, ia64, sparc64
|
#
120389 |
|
23-Sep-2003 |
silby |
Adjust the kmapentzone limit so that it takes into account the size of maxproc and maxfiles, as procs, pipes, and other structures cause allocations from kmapentzone.
Submitted by: tegge
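A hedged sketch of the kind of scaling involved; the constants, the floor, and the arithmetic below are illustrative only, not the committed formula:

    /* Scale the kernel map entry reserve with the structures that
     * consume map entries, not with physical memory alone. */
    int max_kmapent;

    max_kmapent = 4 * maxproc + maxfiles / 8;   /* illustrative only */
    if (max_kmapent < 1024)                     /* illustrative floor */
        max_kmapent = 1024;
    uma_zone_set_max(kmapentzone, max_kmapent);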
|
#
120371 |
|
23-Sep-2003 |
alc |
Change the handling of the kernel and kmem objects in vm_map_delete(): In order to use "unmanaged" pages in the kmem object, vm_map_delete() must unconditionally perform pmap_remove(). Otherwise, sparc64 has problems.
Tested by: jake
|
#
119595 |
|
30-Aug-2003 |
marcel |
Introduce MAP_ENTRY_GROWS_DOWN and MAP_ENTRY_GROWS_UP to allow for growable (stack) entries that not only grow down, but also grow up. Have vm_map_growstack() take these flags into account when growing an entry.
This is the first step in adding support for upward growable stacks. It is a required feature on ia64 to support the register stack (or rstack as I like to call it -- it also means reverse stack). We do not currently create rstacks, so the upward growing is not exercised and the change should be a functional no-op.
Reviewed by: alc
|
#
118878 |
|
13-Aug-2003 |
alc |
Remove GIANT_REQUIRED from vmspace_alloc().
|
#
118771 |
|
11-Aug-2003 |
bms |
Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are* necessary. This needed clarification; the stub code generation for mlockall() was disabled, which would prevent applications from linking to this API (suggested by mux).
- Giant has been quashed. It is no longer held by the code, as the required locking has been pushed down within vm_map.c.
- Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES to express their intention explicitly.
- Inspected at the vmstat, top and vm pager sysctl stats level. Paging-in activity is occurring correctly, using a test harness.
- The RES size for a process may appear to be greater than its SIZE. This is believed to be due to mappings of the same shared library page being wired twice. Further exploration is needed.
- Believed to back out of allocations and locks correctly (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).
PR: kern/43426, standards/54223 Reviewed by: jake, alc Approved by: jake (mentor) MFC after: 2 weeks
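Typical userland usage of the new calls (sketch):

    #include <sys/mman.h>
    #include <err.h>

    int
    main(void)
    {
        /* Wire everything mapped now and everything mapped later. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1)
            err(1, "mlockall");

        /* ... latency-sensitive work, no major page faults ... */

        if (munlockall() == -1)
            err(1, "munlockall");
        return (0);
    }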
|
#
117724 |
|
18-Jul-2003 |
phk |
Move the implementation of the vmspace_swap_count() (used only in the "toss the largest process" emergency handling) from vm_map.c to swap_pager.c.
The quantity calculated depends strongly on the internals of the swap_pager and by moving it, we no longer need to expose the internal metrics of the swap_pager to the world.
|
#
117206 |
|
03-Jul-2003 |
alc |
Background: pmap_object_init_pt() premaps the pages of a object in order to avoid the overhead of later page faults. In general, it implements two cases: one for vnode-backed objects and one for device-backed objects. Only the device-backed case is really machine-dependent, belonging in the pmap.
This commit moves the vnode-backed case into the (relatively) new function vm_map_pmap_enter(). On amd64 and i386, this commit only amounts to code rearrangement. On alpha and ia64, the new machine independent (MI) implementation of the vnode case is smaller and more efficient than their pmap-based implementations. (The MI implementation takes advantage of the fact that objects in -CURRENT are ordered collections of pages.) On sparc64, pmap_object_init_pt() hadn't (yet) been implemented.
|
#
117093 |
|
01-Jul-2003 |
alc |
Check the address provided to vm_map_stack() against the vm map's maximum, returning an error if the address is too high.
|
#
117047 |
|
29-Jun-2003 |
alc |
Introduce vm_map_pmap_enter(). Presently, this is a stub calling the MD pmap_object_init_pt().
|
#
116923 |
|
27-Jun-2003 |
alc |
Simple read-modify-write operations on a vm object's flags, ref_count, and shadow_count can now rely on its mutex for synchronization. Remove one use of Giant from vm_map_insert().
|
#
116799 |
|
25-Jun-2003 |
alc |
Remove a GIANT_REQUIRED on the kernel object that we no longer need.
|
#
116226 |
|
11-Jun-2003 |
obrien |
Use __FBSDID().
|
#
115931 |
|
07-Jun-2003 |
alc |
Pass the vm object to vm_object_collapse() with its lock held.
|
#
114317 |
|
30-Apr-2003 |
alc |
Increase the scope of the vm_object lock in vm_map_delete().
|
#
114263 |
|
29-Apr-2003 |
alc |
Add vm_object locking to vmspace_swap_count().
|
#
114053 |
|
26-Apr-2003 |
alc |
- Extend the scope of two existing vm_object locks to cover swap_pager_freespace().
|
#
113955 |
|
24-Apr-2003 |
alc |
- Acquire the vm_object's lock when performing vm_object_page_clean(). - Add a parameter to vm_pageout_flush() that tells vm_pageout_flush() whether its caller has locked the vm_object. (This is a temporary measure to bootstrap vm_object locking.)
|
#
113768 |
|
20-Apr-2003 |
alc |
- Update the vm_object locking in vm_map_insert().
|
#
113740 |
|
20-Apr-2003 |
alc |
Update vm_object locking in vm_map_delete().
|
#
113701 |
|
18-Apr-2003 |
alc |
o Update locking around vm_object_page_remove() in vm_map_clean() to use the new macros. o Remove unnecessary increment and decrement of the vm_object's reference count in vm_map_clean().
|
#
113449 |
|
13-Apr-2003 |
alc |
Lock some manipulations of the vm object's flags.
|
#
112367 |
|
18-Mar-2003 |
phk |
Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
|
#
112167 |
|
12-Mar-2003 |
das |
- When the VM daemon is out of swap space and looking for a process to kill, don't block on a map lock while holding the process lock. Instead, skip processes whose map locks are held and find something else to kill. - Add vm_map_trylock_read() to support the above.
Reviewed by: alc, mike (mentor)
|
#
111937 |
|
06-Mar-2003 |
alc |
Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on: arch@
|
#
111119 |
|
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
#
110958 |
|
15-Feb-2003 |
alc |
Remove the acquisition and release of Giant around pmap_growkernel(). It's unnecessary for two reasons: (1) Giant is at present already held in such cases and (2) our various implementations of pmap_growkernel() look to be MP safe. (For example, for sparc64 the proof of (2) is trivial.)
|
#
109820 |
|
25-Jan-2003 |
alc |
Add MTX_DUPOK to the initialization of system map locks.
|
#
109623 |
|
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
109572 |
|
20-Jan-2003 |
dillon |
Close the remaining user address mapping races for physical I/O, CAM, and AIO. Still TODO: streamline useracc() checks.
Reviewed by: alc, tegge MFC after: 7 days
|
#
109205 |
|
13-Jan-2003 |
dillon |
It is possible for an active aio to prevent shared memory from being dereferenced when a process exits due to the vmspace ref-count being bumped. Change shmexit() and shmexit_myhook() to take a vmspace instead of a process and call it in vmspace_dofree(). This way if it is missed in exit1()'s early-resource-free it will still be caught when the zombie is reaped.
Also fix a potential race in shmexit_myhook() by NULLing out vmspace->vm_shm prior to calling shm_delete_mapping() and free().
MFC after: 7 days
|
#
108594 |
|
03-Jan-2003 |
alc |
Lock the vm object when performing vm_object_clear_flag().
|
#
108515 |
|
31-Dec-2002 |
alc |
Implement a variant locking scheme for vm maps: Access to system maps is now synchronized by a mutex, whereas access to user maps is still synchronized by a lockmgr()-based lock. Why? No single type of lock, including sx locks, meets the requirements of both types of vm map. Sometimes we sleep while holding the lock on a user map. Thus, a mutex isn't appropriate. On the other hand, both lockmgr()-based and sx locks release Giant when a thread/process blocks during contention for a lock. This could lead to a race condition in a legacy driver (that relies on Giant for synchronization) if it attempts to kmem_malloc() and fails to immediately obtain the lock. Fortunately, we never sleep while holding a system map lock.
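A simplified sketch of the resulting dispatch; the helper shape and the lockmgr() argument list are abbreviated from the era's interfaces:

    /* System maps take a mutex: their holders never sleep, and a
     * contending thread keeps Giant, so legacy Giant-locked drivers
     * stay safe. User maps keep a lockmgr()-based lock because their
     * holders may sleep. */
    static void
    _vm_map_lock(vm_map_t map)
    {
        if (map->system_map)
            mtx_lock(&map->system_mtx);
        else
            lockmgr(&map->lock, LK_EXCLUSIVE, NULL, curthread);
    }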
|
#
108418 |
|
29-Dec-2002 |
alc |
- Increment the vm_map's timestamp if _vm_map_trylock() succeeds. - Introduce map_sleep_mtx and use it to replace Giant in vm_map_unlock_and_wait() and vm_map_wakeup(). (Original version by: tegge.)
|
#
108413 |
|
29-Dec-2002 |
alc |
- Remove vm_object_init2(). It is unused. - Add a mtx_destroy() to vm_object_collapse(). (This allows a bzero() to migrate from _vm_object_allocate() to vm_object_zinit(), where it will be performed less often.)
|
#
107912 |
|
15-Dec-2002 |
dillon |
Fix a refcount race with the vmspace structure. In order to prevent resource starvation we clean-up as much of the vmspace structure as we can when the last process using it exits. The rest of the structure is cleaned up when it is reaped. But since exit1() decrements the ref count it is possible for a double-free to occur if someone else, such as the process swapout code, references and then dereferences the structure. Additionally, the final cleanup of the structure should not occur until the last process referencing it is reaped.
This commit solves the problem by introducing a secondary reference count, called 'vm_exitingcnt'. The normal reference count is decremented on exit and vm_exitingcnt is incremented. vm_exitingcnt is decremented when the process is reaped. When both vm_exitingcnt and vm_refcnt are 0, the structure is freed for real.
MFC after: 3 weeks
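A sketch of the resulting lifetime rule (locking elided; vmspace_partial_cleanup() is a hypothetical stand-in for the early teardown):

    /* In exit1(): convert the exiting process's normal reference
     * into an exiting reference and reclaim what we can now; the
     * structure itself survives until the zombie is reaped. */
    vm->vm_exitingcnt++;
    if (--vm->vm_refcnt == 0)
        vmspace_partial_cleanup(vm);    /* hypothetical early teardown */

    /* At reap time: the real free happens only once both counts
     * are zero, closing the double-free window. */
    if (--vm->vm_exitingcnt == 0 && vm->vm_refcnt == 0)
        vmspace_dofree(vm);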
|
#
107892 |
|
15-Dec-2002 |
alc |
Perform vm_object_lock() and vm_object_unlock() around vm_object_page_remove().
|
#
107464 |
|
01-Dec-2002 |
alc |
Hold the page queues lock when calling pmap_protect(); it updates fields of the vm_page structure. Make the style of the pmap_protect() calls consistent.
Approved by: re (blanket)
|
#
107250 |
|
25-Nov-2002 |
alc |
Acquire and release the page queues lock around calls to pmap_protect() because it updates flags within the vm page.
Approved by: re (blanket)
|
#
106708 |
|
09-Nov-2002 |
alc |
Fix an error case in vm_map_wire(): unwiring of an entry during cleanup after a user wire error fails when the entry is already system wired.
Reported by: tegge
|
#
106600 |
|
07-Nov-2002 |
mux |
Correctly print vm_offset_t types.
|
#
105229 |
|
16-Oct-2002 |
phk |
Properly put macro args in ().
Spotted by: FlexeLint.
|
#
103794 |
|
22-Sep-2002 |
mdodd |
Modify vm_map_clean() (and thus the msync(2) system call) to support invalidation of cached pages for objects of type OBJT_DEVICE.
Submitted by: Christian Zander <zander@minion.de> Approved by: alc
|
#
103767 |
|
21-Sep-2002 |
jake |
Use the fields in the sysentvec and in the vm map header in place of the constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
|
#
102370 |
|
24-Aug-2002 |
alc |
o Use vm_object_lock() in place of Giant when manipulating a vm object in vm_map_insert().
|
#
100630 |
|
24-Jul-2002 |
alc |
o Merge vm_fault_wire() and vm_fault_user_wire() by adding a new parameter, user_wire.
|
#
100384 |
|
20-Jul-2002 |
peter |
Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable handler in the kernel at the same time. Also, allow for the exec_new_vmspace() code to build a different sized vmspace depending on the executable environment. This is a big help for execing i386 binaries on ia64. The ELF exec code grows the ability to map partial pages when there is a page size difference, eg: emulating 4K pages on 8K or 16K hardware pages.
Flesh out the i386 emulation support for ia64. At this point, the only binary that I know of that fails is cvsup, because the cvsup runtime tries to execute code in pages not marked executable.
Obtained from: dfr (mostly, many tweaks from me).
|
#
100309 |
|
18-Jul-2002 |
peter |
(VM_MAX_KERNEL_ADDRESS - KERNBASE) / PAGE_SIZE may not fit in an integer. Use lmin(long, long), not min(u_int, u_int). This is a problem here on ia64 which has *way* more than 2^32 pages of KVA. 281474976710655 pages to be precise.
|
#
99893 |
|
12-Jul-2002 |
alc |
o Assert GIANT_REQUIRED on system maps in _vm_map_lock(), _vm_map_lock_read(), and _vm_map_trylock(). Submitted by: tegge
o Remove GIANT_REQUIRED from kmem_alloc_wait() and kmem_free_wakeup(). (This clears the way for exec_map accesses to move outside of Giant. The exec_map is not a system map.)
o Remove some premature MPSAFE comments.
Reviewed by: tegge
|
#
99754 |
|
11-Jul-2002 |
alc |
o Add a "needs wakeup" flag to the vm_map for use by kmem_alloc_wait() and kmem_free_wakeup(). Previously, kmem_free_wakeup() always called wakeup(). In general, no one was sleeping. o Export vm_map_unlock_and_wait() and vm_map_wakeup() from vm_map.c for use in vm_kern.c.
|
#
99374 |
|
03-Jul-2002 |
alc |
o Make the reservation of KVA space for kernel map entries a function of the KVA space's size in addition to the amount of physical memory and reduce it by a factor of two.
Under the old formula, our reservation amounted to one kernel map entry per virtual page in the KVA space on a 4GB i386.
|
#
98892 |
|
26-Jun-2002 |
iedowse |
Avoid using the 64-bit vm_pindex_t in a few places where 64-bit types are not required, as the overhead is unnecessary:
o In the i386 pmap_protect(), `sindex' and `eindex' represent page indices within the 32-bit virtual address space.
o In swp_pager_meta_build() and swp_pager_meta_ctl(), use a temporary variable to store the low few bits of a vm_pindex_t that gets used as an array index.
o vm_uiomove() uses `osize' and `idx' for page offsets within a map entry.
o In vm_object_split(), `idx' is a page offset within a map entry.
|
#
98848 |
|
26-Jun-2002 |
dillon |
Enforce RLIMIT_VMEM on growable mappings (aka the primary stack or any MAP_STACK mapping).
Suggested by: alc
|
#
98624 |
|
22-Jun-2002 |
alc |
o In vm_map_insert(), replace GIANT_REQUIRED by the acquisition and release of Giant around the direct manipulation of the vm_object and the optional call to pmap_object_init_pt().
o In vm_map_findspace(), remove GIANT_REQUIRED. Instead, acquire and release Giant around the occasional call to pmap_growkernel().
o In vm_map_find(), remove GIANT_REQUIRED.
|
#
98541 |
|
21-Jun-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_stack().
|
#
98414 |
|
19-Jun-2002 |
alc |
o Replace GIANT_REQUIRED in vm_object_coalesce() by the acquisition and release of Giant. o Reduce the scope of GIANT_REQUIRED in vm_map_insert().
These changes will enable us to remove the acquisition and release of Giant from obreak().
|
#
98397 |
|
18-Jun-2002 |
alc |
o Remove LK_CANRECURSE from the vm_map lock.
|
#
98361 |
|
17-Jun-2002 |
jeff |
- Introduce the new M_NOVM option which tells uma to only check the currently allocated slabs and bucket caches for free items. It will not go ask the vm for pages. This differs from M_NOWAIT in that it not only doesn't block, it doesn't even ask.
- Add a new zcreate option ZONE_VM, that sets the BUCKETCACHE zflag. This tells uma that it should only allocate buckets out of the bucket cache, and not from the VM. It does this by using the M_NOVM option to zalloc when getting a new bucket. This is so that the VM doesn't recursively enter itself while trying to allocate buckets for vm_map_entry zones. If there are already allocated buckets when we get here we'll still use them but otherwise we'll skip it.
- Use the ZONE_VM flag on vm map entries and pv entries on x86.
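Usage sketch: combined with M_NOWAIT, the flag keeps a map-entry allocation from re-entering the VM for fresh pages (the failure handling shown is illustrative):

    /* Take an entry from existing slabs or bucket caches only; do
     * not recurse into the VM to allocate new pages for the zone. */
    new_entry = uma_zalloc(kmapentzone, M_NOWAIT | M_NOVM);
    if (new_entry == NULL)
        panic("vm_map_entry_create: out of map entries");   /* illustrative */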
|
#
98343 |
|
17-Jun-2002 |
alc |
o Acquire and release Giant in vm_map_wakeup() to prevent a lost wakeup().
Reviewed by: tegge
|
#
98226 |
|
14-Jun-2002 |
alc |
o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and vm_map_user_pageable().
o Remove vm_map_pageable() and vm_map_user_pageable().
o Remove vm_map_clear_recursive() and vm_map_set_recursive(). (They were only used by vm_map_pageable() and vm_map_user_pageable().)
Reviewed by: tegge
|
#
98142 |
|
12-Jun-2002 |
alc |
o Acquire and release Giant in vm_map_unlock_and_wait().
Submitted by: tegge
|
#
98119 |
|
11-Jun-2002 |
alc |
o Properly handle a failure by vm_fault_wire() or vm_fault_user_wire() in vm_map_wire(). o Make two white-space changes in vm_map_wire().
Reviewed by: tegge
|
#
98109 |
|
11-Jun-2002 |
alc |
o Teach vm_map_delete() to respect the "in-transition" flag on a vm_map_entry by sleeping until the flag is cleared.
Submitted by: tegge
|
#
98083 |
|
10-Jun-2002 |
alc |
o In vm_map_entry_create(), call uma_zalloc() with M_NOWAIT on system maps. Submitted by: tegge
o Eliminate the "!mapentzone" check from vm_map_entry_create() and vm_map_entry_dispose(). Reviewed by: tegge
o Fix white-space usage in vm_map_entry_create().
|
#
98071 |
|
09-Jun-2002 |
alc |
o Add vm_map_wire() for wiring contiguous regions of either kernel or user vm_maps. This implementation has two key benefits when compared to vm_map_{user_,}pageable(): (1) it avoids a race condition through the use of "in-transition" vm_map entries and (2) it eliminates lock recursion on the vm_map.
Note: there is still an error case that requires clean up.
Reviewed by: tegge
|
#
98052 |
|
08-Jun-2002 |
alc |
o Simplify vm_map_unwire() by merging the second and third passes over the caller-specified region.
|
#
98036 |
|
08-Jun-2002 |
alc |
o Remove an unnecessary call to vm_map_wakeup() from vm_map_unwire(). o Add a stub for vm_map_wire().
Note: the description of the previous commit had an error. The in- transition flag actually blocks the deallocation of a vm_map_entry by vm_map_delete() and vm_map_simplify_entry().
|
#
98022 |
|
07-Jun-2002 |
alc |
o Add vm_map_unwire() for unwiring contiguous regions of either kernel or user vm_maps. In accordance with the standards for munlock(2), and in contrast to vm_map_user_pageable(), this implementation does not allow holes in the specified region. This implementation uses the "in transition" flag described below.
o Introduce a new flag, "in transition," to the vm_map_entry. Eventually, vm_map_delete() and vm_map_simplify_entry() will respect this flag by deallocating in-transition vm_map_entrys, allowing the vm_map lock to be safely released in vm_map_unwire() and (the forthcoming) vm_map_wire().
o Modify vm_map_simplify_entry() to respect the in-transition flag.
In collaboration with: tegge
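A condensed sketch of the protocol (real flag names, simplified control flow; do_unwire_pages() is a hypothetical placeholder for the work done without the map lock):

    /* Mark the entry so vm_map_delete() and vm_map_simplify_entry()
     * leave it alone, then drop the map lock for the slow work. */
    entry->eflags |= MAP_ENTRY_IN_TRANSITION;
    vm_map_unlock(map);

    rv = do_unwire_pages(map, entry);   /* hypothetical; may sleep */

    vm_map_lock(map);
    entry->eflags &= ~MAP_ENTRY_IN_TRANSITION;
    if (entry->eflags & MAP_ENTRY_NEEDS_WAKEUP) {
        entry->eflags &= ~MAP_ENTRY_NEEDS_WAKEUP;
        vm_map_wakeup(map);             /* wake anyone who saw the flag */
    }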
|
#
97753 |
|
02-Jun-2002 |
alc |
o Migrate vm_map_split() from vm_map.c to vm_object.c, renaming it to vm_object_split(). Its interface should still be changed to resemble vm_object_shadow().
|
#
97747 |
|
02-Jun-2002 |
alc |
o Style fixes to vm_map_split(), including the elimination of one variable declaration that shadows another.
Note: This function should really be vm_object_split(), not vm_map_split().
Reviewed by: md5
|
#
97727 |
|
01-Jun-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_zfini(), vm_map_zinit(), vm_map_create(), and vm_map_submap().
o Make further use of a local variable in vm_map_entry_splay() that caches a reference to one of a vm_map_entry's children. (This reduces code size somewhat.)
o Revert a part of revision 1.66, deinlining vmspace_pmap(). (This function is MPSAFE.)
|
#
97710 |
|
01-Jun-2002 |
alc |
o Revert a part of revision 1.66; contrary to what that commit message says, deinlining vm_map_entry_behavior() and vm_map_entry_set_behavior() actually increases the kernel's size.
o Make vm_map_entry_set_behavior() static and add a comment describing its purpose.
o Remove an unnecessary initialization statement from vm_map_entry_splay().
|
#
97648 |
|
31-May-2002 |
alc |
Further work on pushing Giant out of the vm_map layer and down into the vm_object layer:
o Acquire and release Giant in vm_object_shadow() and vm_object_page_remove().
o Remove the GIANT_REQUIRED assertion preceding vm_map_delete()'s call to vm_object_page_remove().
o Remove the acquisition and release of Giant around vm_map_lookup()'s call to vm_object_shadow().
|
#
97294 |
|
26-May-2002 |
alc |
o Acquire and release Giant around pmap operations in vm_fault_unwire() and vm_map_delete(). Assert GIANT_REQUIRED in vm_map_delete() only if operating on the kernel_object or the kmem_object.
o Remove GIANT_REQUIRED from vm_map_remove().
o Remove the acquisition and release of Giant from munmap().
|
#
97198 |
|
23-May-2002 |
alc |
o Replace the vm_map's hint by the root of a splay tree. By design, the last accessed datum is moved to the root of the splay tree. Therefore, on lookups in which the hint resulted in O(1) access, the splay tree still achieves O(1) access. In contrast, on lookups in which the hint failed miserably, the splay tree achieves amortized logarithmic complexity, resulting in dramatic improvements on vm_maps with a large number of entries. For example, the execution time for replaying an access log from www.cs.rice.edu against the thttpd web server was reduced by 23.5% due to the large number of files simultaneously mmap()ed by this server. (The machine in question has enough memory to cache most of this workload.)
Nothing comes for free: At present, I see a 0.2% slowdown on "buildworld" due to the overhead of maintaining the splay tree. I believe that some or all of this can be eliminated through optimizations to the code.
Developed in collaboration with: Juan E Navarro <jnavarro@cs.rice.edu> Reviewed by: jeff
|
#
96839 |
|
18-May-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_madvise(). Instead, acquire and release Giant around vm_map_madvise()'s call to pmap_object_init_pt().
o Replace GIANT_REQUIRED in vm_object_madvise() with the acquisition and release of Giant.
o Remove the acquisition and release of Giant from madvise().
|
#
96469 |
|
12-May-2002 |
alc |
o Remove GIANT_REQUIRED and an excessive number of blank lines from vm_map_inherit(). (minherit() need not acquire Giant anymore.)
|
#
96441 |
|
12-May-2002 |
alc |
o Acquire and release Giant in vm_object_reference() and vm_object_deallocate(), replacing the assertion GIANT_REQUIRED.
o Remove GIANT_REQUIRED from vm_map_protect() and vm_map_simplify_entry().
o Acquire and release Giant around vm_map_protect()'s call to pmap_protect().
Altogether, these changes eliminate the need for mprotect() to acquire and release Giant.
|
#
96087 |
|
05-May-2002 |
alc |
o Move vm_freeze_copyopts() from vm_map.{c.h} to vm_object.{c,h}. It's plainly an operation on a vm_object and belongs in the latter place.
|
#
96080 |
|
05-May-2002 |
alc |
o Condition the compilation of uiomoveco() and vm_uiomove() on ENABLE_VFS_IOOPT. o Add a comment to the effect that this code is experimental support for zero-copy I/O.
|
#
96056 |
|
05-May-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_lookup() and vm_map_lookup_done(). o Acquire and release Giant around vm_map_lookup()'s call to vm_object_shadow().
|
#
96007 |
|
04-May-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_lookup_entry() and vm_map_check_protection(). o Call vm_map_check_protection() without Giant held in munmap().
|
#
95942 |
|
02-May-2002 |
alc |
o Change the implementation of vm_map locking to use exclusive locks exclusively. The interface still, however, distinguishes between a shared lock and an exclusive lock.
|
#
95901 |
|
02-May-2002 |
alc |
o Remove dead and lockmgr()-specific debugging code.
|
#
95758 |
|
29-Apr-2002 |
jeff |
Add a new zone flag UMA_ZONE_MTXCLASS. This puts the zone in its own mutex class. Currently this is only used for kmapentzone because kmapents are potentially allocated when freeing memory. This is not dangerous though because no other allocations will be done while holding the kmapentzone lock.
|
#
95686 |
|
28-Apr-2002 |
alc |
Pass the caller's file name and line number to the vm_map locking functions.
|
#
95610 |
|
28-Apr-2002 |
alc |
o Introduce and use vm_map_trylock() to replace several direct uses of lockmgr(). o Add missing synchronization to vmspace_swap_count(): Obtain a read lock on the vm_map before traversing it.
|
#
95589 |
|
27-Apr-2002 |
alc |
o Begin documenting the (existing) locking protocol on the vm_map in the same style as sys/proc.h. o Undo the de-inlining of several trivial, MPSAFE methods on the vm_map. (Contrary to the commit message for vm_map.h revision 1.66 and vm_map.c revision 1.206, de-inlining these methods increased the kernel's size.)
|
#
94921 |
|
17-Apr-2002 |
peter |
Do not free the vmspace until p->p_vmspace is set to null. Otherwise statclock can access it in the tail end of statclock_process() at an unfortunate time. This bit me several times on an SMP alpha (UP2000) and the problem went away with this change. I'm not sure why it doesn't break x86 as well. Maybe it's because the clocks are much faster on alpha (HZ=1024 by default).
|
#
94777 |
|
15-Apr-2002 |
peter |
Pass vm_page_t instead of physical addresses to pmap_zero_page[_area]() and pmap_copy_page(). This gets rid of a couple more physical addresses in upper layers, with the eventual aim of supporting PAE and dealing with the physical addressing mostly within pmap. (We will need either 64 bit physical addresses or page indexes, possibly both depending on the circumstances. Leaving this to pmap itself gives more flexibilitly.)
Reviewed by: jake Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)
|
#
92748 |
|
20-Mar-2002 |
jeff |
Remove references to vm_zone.h and switch over to the new uma API.
|
#
92692 |
|
19-Mar-2002 |
jeff |
Quiet a warning introduced by UMA. This only occurs on machines where vm_size_t != unsigned long.
Reviewed by: phk
|
#
92654 |
|
19-Mar-2002 |
jeff |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator.
Reviewed by: arch@
|
#
92588 |
|
18-Mar-2002 |
green |
Back out the modification of vm_map locks from lockmgr to sx locks. The best path forward now is likely to change the lockmgr locks to simple sleep mutexes, then see if any extra contention it generates is greater than the removed overhead of managing local locking state information, cost of extra calls into lockmgr, etc.
Additionally, making the vm_map lock a mutex and respecting it properly will put us much closer to not needing Giant magic in vm.
|
#
92466 |
|
17-Mar-2002 |
alc |
Acquire a read lock on the map inside of vm_map_check_protection() rather than expecting the caller to do so. This (1) eliminates duplicated code in kernacc() and useracc() and (2) fixes missing synchronization in munmap().
|
#
92246 |
|
13-Mar-2002 |
green |
Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate. While doing this, move it earlier in the sysinit boot process so that the VM system can use it.
After that, the system is now able to use sx locks instead of lockmgr locks in the VM system. To accomplish this, some of the more questionable uses of the locks (such as testing whether they are owned or not, as well as allowing shared+exclusive recursion) are removed, and simpler logic throughout is used so locks should also be easier to understand.
This has been tested on my laptop for months, and has not shown any problems on SMP systems, either, so appears quite safe. One more user of lockmgr down, many more to go :)
|
#
92029 |
|
10-Mar-2002 |
eivind |
- Remove a number of extra newlines that do not belong here according to style(9)
- Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc.
- Add /* SYMBOL */ after a few #endifs.
Reviewed by: alc
|
#
91777 |
|
07-Mar-2002 |
dillon |
Fix a bug in the vm_map_clean() procedure. msync()ing an area of memory that has just been mapped MAP_ANON|MAP_NOSYNC and has not yet been accessed will panic the machine.
MFC after: 1 day
|
#
90263 |
|
05-Feb-2002 |
alfred |
Fix a race with free'ing vmspaces at process exit when vmspaces are shared.
Also introduce vm_endcopy instead of using pointer tricks when initializing new vmspaces.
The race occurred because of how the reference was utilized: test vmspace reference, possibly block, decrement reference
When sharing a vmspace between multiple processes it was possible for two processes exiting at the same time to test the reference count, possibly block and neither one free because they wouldn't see the other's update.
Submitted by: green
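The racy and corrected patterns side by side (sketch; the atomic shown is a later primitive, used here for concision):

    /* Racy: two exiting sharers can both observe refcnt > 1, block,
     * and then both decrement -- neither one sees the other's update,
     * so the vmspace is never freed (or is freed twice). */
    if (vm->vm_refcnt == 1)
        vmspace_dofree(vm);
    else
        vm->vm_refcnt--;

    /* Race-free: a single atomic decrement decides the last user. */
    if (atomic_fetchadd_int(&vm->vm_refcnt, -1) == 1)
        vmspace_dofree(vm);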
|
#
85762 |
|
31-Oct-2001 |
dillon |
Don't let pmap_object_init_pt() exhaust all available free pages (allocating pv entries w/ zalloci) when called in a loop due to an madvise(). It is possible to completely exhaust the free page list and cause a system panic when an expected allocation fails.
|
#
84932 |
|
14-Oct-2001 |
tegge |
Fix locking violations during page wiring:
- vm map entries are not valid after the map has been unlocked.
- An exclusive lock on the map is needed before calling vm_map_simplify_entry().
Fix cleanup after page wiring failure to unwire all pages that had been successfully wired before the failure was detected.
Reviewed by: dillon
|
#
84812 |
|
11-Oct-2001 |
jhb |
Add missing includes of sys/ktr.h.
|
#
84783 |
|
10-Oct-2001 |
ps |
Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable.
Reviewed by: peter MFC after: 2 weeks
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2. Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
79248 |
|
04-Jul-2001 |
dillon |
Change inlines back into mainline code in preparation for mutexing. Also, most of these inlines had been bloated in -current far beyond their original intent. Normalize prototypes and function declarations to be ANSI only (half already were). And do some general cleanup.
(kernel size also reduced by 50-100K, but that isn't the prime intent)
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
78592 |
|
22-Jun-2001 |
bmilekic |
Introduce numerous SMP friendly changes to the mbuf allocator. Namely, introduce a modified allocation mechanism for mbufs and mbuf clusters; one which can scale under SMP and which offers the possibility of resource reclamation to be implemented in the future. Notable advantages:
o Reduce contention for SMP by offering per-CPU pools and locks.
o Better use of data cache due to per-CPU pools.
o Much less code cache pollution due to excessively large allocation macros.
o Framework for `grouping' objects from same page together so as to be able to possibly free wired-down pages back to the system if they are no longer needed by the network stacks.
Additional things changed with this addition:
- Moved some mbuf specific declarations and initializations from sys/conf/param.c into mbuf-specific code where they belong.
- m_getclr() has been renamed to m_get_clrd() because the old name is really confusing. m_getclr() HAS been preserved though and is defined to the new name. No tree sweep has been done "to change the interface," as the old name will continue to be supported and is not deprecated. The change was merely done because m_getclr() sounds too much like "m_get a cluster."
- TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and systat(1) (see TODO below).
- Fixed systat(1) to display number of "free mbufs" based on new per-CPU stat structures.
- Fixed netstat(1) to display new per-CPU stats based on sysctl-exported per-CPU stat structures. All infos are fetched via sysctl.
TODO (in order of priority):
- Re-enable mbtypes statistics in both netstat(1) and systat(1) after introducing an SMP friendly way to collect the mbtypes stats under the already introduced per-CPU locks (i.e. hopefully don't use atomic() - it seems too costly for a mere stat update, especially when other locks are already present).
- Optionally have systat(1) display not only "total free mbufs" but also "total free mbufs per CPU pool."
- Fix minor length-fetching issues in netstat(1) related to recently re-enabled option to read mbuf stats from a core file.
- Move reference counters at least for mbuf clusters into an unused portion of the cluster itself, to save space and need to allocate a counter.
- Look into introducing resource freeing possibly from a kproc.
Reviewed by (in parts): jlemon, jake, silby, terry Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha) Preliminary performance measurements: jlemon (and me, obviously) URL: http://people.freebsd.org/~bmilekic/mb_alloc/
|
#
78099 |
|
11-Jun-2001 |
dillon |
Cleanup the tabbing
|
#
77948 |
|
09-Jun-2001 |
dillon |
Two fixes to the out-of-swap process termination code. First, start killing processes a little earlier to avoid a deadlock. Second, when calculating the 'largest process' do not just count RSS. Instead count the RSS + SWAP used by the process. Without this the code tended to kill small inconsequential processes like, oh, sshd, rather than one of the many 'eatmem 200MB' I run on a whim :-). This fix has been extensively tested on -stable and somewhat tested on -current and will be MFCd in a few days.
Shamed into fixing this by: ps
|
#
77090 |
|
23-May-2001 |
jhb |
- Add lots of vm_mtx assertions.
- Add a few KTR tracepoints to track the addition and removal of vm_map_entry's and the creation and free'ing of vmspace's.
- Adjust a few portions of code so that we update the process' vmspace pointer to its new vmspace before freeing the old vmspace.
|
#
76827 |
|
18-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
#
76166 |
|
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
#
75452 |
|
12-Apr-2001 |
alfred |
remove truncated part from comment
|
#
74237 |
|
14-Mar-2001 |
dillon |
Fix a lock reversal problem in the VM subsystem related to threaded programs. There is a case during a fork() which can cause a deadlock.
From Tor: The workaround consists of setting a flag in the vm map to indicate that a fork is in progress, and using that mark in the page fault handling to force a revalidation failure. That change will only affect (pessimize) page fault handling during fork for threaded (linuxthreads style) applications and applications using aio_*().
Submitted by: tegge
|
#
74235 |
|
14-Mar-2001 |
dillon |
Temporarily remove the vm_map_simplify() call from vm_map_insert(). The call is correct, but it interferes with the massive hack called vm_map_growstack(). The call will be restored after our stack handling code is fixed.
Reported by: tegge
|
#
74042 |
|
09-Mar-2001 |
iedowse |
When creating a shadow vm_object in vmspace_fork(), only one reference count was transferred to the new object, but both the new and the old map entries had pointers to the new object. Correct this by transferring the second reference.
This fixes a panic that can occur when mmap(2) is used with the MAP_INHERIT flag.
PR: i386/25603 Reviewed by: dillon, alc
|
#
71983 |
|
04-Feb-2001 |
dillon |
This commit represents work mainly submitted by Tor and slightly modified by myself. It solves a serious vm_map corruption problem that can occur with the buffer cache when block sizes > 64K are used. This code has been heavily tested in -stable but only tested somewhat on -current. An MFC will occur in a few days. My additions include the vm_map_simplify_entry() call and a minor buffer cache boundary case fix.
Make the buffer cache use a system map for buffer cache KVM rather than a normal map.
Ensure that VM objects are not allocated for system maps. There were cases where a buffer map could wind up with a backing VM object -- normally harmless, but this could also result in the buffer cache blocking in places where it assumes no blocking will occur, possibly resulting in corrupted maps.
Fix a minor boundary case when the buffer cache size limit is reached that could result in non-optimal code.
Add vm_map_simplify_entry() calls to prevent 'creeping proliferation' of vm_map_entry's in the buffer cache's vm_map. Previously only a simple linear optimization was made. (The buffer vm_map typically has only a handful of vm_map_entry's. This stabilizes it at that level permanently).
PR: 20609 Submitted by: (Tor Egge) tegge
|
#
69972 |
|
13-Dec-2000 |
tanimura |
- If swap metadata does not fit into the KVM, reduce the number of struct swblock entries by dividing the number of the entries by 2 until the swap metadata fits.
- Reject swapon(2) upon failure of swap_zone allocation.
This is just a temporary fix. Better solutions include: (suggested by: dillon)
o reserving swap in SWAP_META_PAGES chunks, and
o swapping the swblock structures themselves.
Reviewed by: alfred, dillon
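A minimal sketch of the halving strategy described above; swblock_zone_create() is a hypothetical stand-in for the real zone setup, and desired_swblock_entries is illustrative:

    /* Halve the entry count until the swap metadata fits in KVM. */
    int n = desired_swblock_entries;        /* illustrative starting point */
    while (n > 0 && (swap_zone = swblock_zone_create(n)) == NULL)
        n /= 2;
    if (swap_zone == NULL)
        return (ENOMEM);                    /* swapon(2) is rejected, as above */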
|
#
68261 |
|
02-Nov-2000 |
tegge |
Clear the MAP_ENTRY_USER_WIRED flag from cloned vm_map entries. PR: 2840
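A minimal sketch of the fix; the flag name is from the message above, and the clone-time context is illustrative:

    /* When duplicating map entries into a new map, drop the user-wired
     * state: the new address space has not mlock()ed anything. */
    new_entry->eflags &= ~MAP_ENTRY_USER_WIRED;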
|
#
66615 |
|
03-Oct-2000 |
jasone |
Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
|
#
60557 |
|
14-May-2000 |
dillon |
Fixed bug in madvise() / MADV_WILLNEED. When the request is offset from the base of the first map_entry the call to pmap_object_init_pt() uses the wrong start VA. MFC to follow.
PR: i386/18095
|
#
58705 |
|
27-Mar-2000 |
charnier |
Revert spelling mistake I made in the previous commit Requested by: Alan and Bruce
|
#
58634 |
|
26-Mar-2000 |
charnier |
Spelling
|
#
57550 |
|
28-Feb-2000 |
ps |
Add MAP_NOCORE to mmap(2), and MADV_NOCORE and MADV_CORE to madvise(2). This feature allows you to specify whether mmap'd data is included in an application's corefile.
Change the type of eflags in struct vm_map_entry from u_char to vm_eflags_t (an unsigned int).
Reviewed by: dillon,jdp,alfred Approved by: jkh
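A usage sketch of the new interfaces (len is illustrative):

    #include <sys/mman.h>
    #include <stddef.h>

    /* Keep a secret-holding buffer out of any core file. */
    static void *
    alloc_nocore(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_NOCORE, -1, 0);
        return (p == MAP_FAILED ? NULL : p);
    }

    /* Existing mappings can be toggled after the fact with
     * madvise(p, len, MADV_NOCORE) / madvise(p, len, MADV_CORE). */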
|
#
57263 |
|
16-Feb-2000 |
dillon |
Fix null-pointer dereference crash when the system is intentionally run out of KVM through a mmap()/fork() bomb that allocates hundreds of thousands of vm_map_entry structures.
Add panic to make null-pointer dereference crash a little more verbose.
Add a new sysctl, vm.max_proc_mmap, which specifies the maximum number of mmap()'d spaces (discrete vm_map_entry's in the process). The value defaults to around 9000 for a 128MB machine. The test is scaled for the number of processes sharing a vmspace (aka linux threads). Setting the value to 0 disables the feature.
PR: kern/16573 Approved by: jkh
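The new knob can be read (or set, with appropriate privilege) through the usual sysctl interface; a small sketch:

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdio.h>

    int
    main(void)
    {
        int v;
        size_t len = sizeof(v);

        /* 0 disables the limit; the default scales with machine memory. */
        if (sysctlbyname("vm.max_proc_mmap", &v, &len, NULL, 0) == 0)
            printf("vm.max_proc_mmap: %d\n", v);
        return (0);
    }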
|
#
56378 |
|
21-Jan-2000 |
dillon |
Fix a deadlock between msync(..., MS_INVALIDATE) and vm_fault. The invalidation code cannot wait for paging to complete while holding a vnode lock, so we don't wait. Instead we simply allow the lower level code to block on any busy pages it encounters. I think Yahoo may be the only entity in the entire world that actually uses this msync feature :-).
Bug reported by: Paul Saab <paul@mu.org>
|
#
54467 |
|
12-Dec-1999 |
dillon |
Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to madvise().
This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis.
MAP_NOSYNC allows one to use mmap() to share memory between processes without incurring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory.
Reviewed by: julian, alc, dg
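A usage sketch of both interfaces (fd and len are parameters of the sketch):

    #include <sys/mman.h>
    #include <stddef.h>

    static void *
    map_nosync(int fd, size_t len)
    {
        /* Dirty pages here are skipped by the update daemon but are
         * still written by the pager and at msync(2)/munmap(2). */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
            MAP_SHARED | MAP_NOSYNC, fd, 0);
        return (p == MAP_FAILED ? NULL : p);
    }

    /* madvise(p, len, MADV_NOSYNC) flips an existing mapping;
     * MADV_AUTOSYNC restores the default behavior. */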
|
#
53701 |
|
25-Nov-1999 |
alc |
Remove nonsensical vm_map_{clear,set}_recursive() calls from vm_map_pageable(). At the point they are called, vm_map_pageable() holds a read (or shared) lock on the map. The purpose of vm_map_{clear,set}_recursive() is to disable/enable repeated write (or exclusive) lock requests by the same process.
|
#
53627 |
|
23-Nov-1999 |
alc |
Correct the following error: vm_map_pageable() on a COW'ed (post-fork) vm_map always failed because vm_map_lookup() looked at "vm_map_entry->wired_count" instead of "(vm_map_entry->eflags & MAP_ENTRY_USER_WIRED)". The effect was that many page wiring operations by sysctl were (silently) failing.
|
#
52973 |
|
07-Nov-1999 |
alc |
Remove unused #include's.
Submitted by: phk
|
#
52960 |
|
07-Nov-1999 |
alc |
The functions declared by this header file no longer exist.
Submitted by: phk (in part)
|
#
52635 |
|
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial, bordering-on-the-silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
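A sketch of the intended call style after that follow-up; the surrounding error handling and the exact useracc() prototype of this era are assumptions:

    /* Validate user access with VM protection bits instead of the
     * buffer-I/O B_READ/B_WRITE flags. */
    if (!useracc(uaddr, len, VM_PROT_WRITE))
        return (EFAULT);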
|
#
51493 |
|
21-Sep-1999 |
dillon |
Clean up the madvise code; add a few more sanity checks.
Reviewed by: Alan Cox <alc@cs.rice.edu>, dg@root.com
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
49697 |
|
13-Aug-1999 |
alc |
vm_map_madvise: A complete rewrite by dillon and myself to separate the implementation of behaviors that affect the vm_map_entry from those that affect the vm_object.
A result of this change is that madvise(..., MADV_FREE); is much cheaper.
|
#
49592 |
|
10-Aug-1999 |
alc |
vm_map_madvise: Now that behaviors are stored in the vm_map_entry rather than the vm_object, it's no longer necessary to instantiate a vm_object just to hold the behavior.
Reviewed by: dillon
|
#
49338 |
|
01-Aug-1999 |
alc |
Move the memory access behavior information provided by madvise from the vm_object to the vm_map.
Submitted by: dillon
|
#
48963 |
|
21-Jul-1999 |
alc |
Fix the following problem:
When creating new processes (or performing exec), the new page directory is initialized too early. The kernel might grow before p_vmspace is initialized for the new process. Since pmap_growkernel doesn't yet know about the new page directory, it isn't updated, and subsequent use causes a failure.
The fix is (1) to clear p_vmspace early, to stop pmap_growkernel from stomping on memory, and (2) to defer part of the initialization of new page directories until p_vmspace is initialized.
PR: kern/12378 Submitted by: tegge Reviewed by: dfr
|
#
48757 |
|
11-Jul-1999 |
alc |
Cleanup OBJ_ONEMAPPING management.
vm_map.c: Don't set OBJ_ONEMAPPING on arbitrary vm objects. Only default and swap type vm objects should have it set. vm_object_deallocate already handles these cases.
vm_object.c: If OBJ_ONEMAPPING isn't already clear in vm_object_shadow, we are in trouble. Instead of clearing it, make it an assertion that it is already clear.
|
#
48409 |
|
01-Jul-1999 |
peter |
Fix some int/long printf problems for the Alpha
|
#
47986 |
|
17-Jun-1999 |
alc |
vm_map_growstack uses vmspace::vm_ssize as though it contained the stack size in bytes when in fact it is the stack size in pages.
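Since vm_ssize counts pages, byte values need an explicit conversion; a minimal sketch using the standard pages-to-bytes macro:

    /* vm_ssize is in pages; ctob() converts a page count to bytes. */
    size_t stack_bytes = ctob(vm->vm_ssize);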
|
#
47968 |
|
17-Jun-1999 |
alc |
vm_map_insert sometimes extends an existing vm_map entry, rather than creating a new entry. vm_map_stack and vm_map_growstack can panic when a new entry isn't created. Fixed vm_map_stack and vm_map_growstack.
Also, when extending the stack, always set the protection to VM_PROT_ALL.
|
#
47966 |
|
16-Jun-1999 |
alc |
Move vm_map_stack and vm_map_growstack after the definition of the vm_map_clip_end macro. (The next commit will modify vm_map_stack and vm_map_growstack to use vm_map_clip_end.)
|
#
47965 |
|
16-Jun-1999 |
alc |
Remove some unused declarations and duplicate initialization.
|
#
47888 |
|
12-Jun-1999 |
alc |
vm_map_protect: The wrong vm_map_entry is used to determine if writes must not be allowed due to COW.
|
#
47568 |
|
28-May-1999 |
alc |
Avoid the creation of unnecessary shadow objects.
|
#
47290 |
|
18-May-1999 |
alc |
vm_map_insert: General cleanup. Eliminate coalescing checks that are duplicated by vm_object_coalesce.
|
#
47258 |
|
16-May-1999 |
alc |
Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert, eliminating the need for the pmap_object_init_pt calls in imgact_* and mmap.
Reviewed by: David Greenman <dg@root.com>
|
#
47243 |
|
16-May-1999 |
alc |
Remove prototypes for functions that don't exist anymore (vm_map.h).
Remove a useless argument from vm_map_madvise's interface (vm_map.c, vm_map.h, and vm_mmap.c).
Remove a redundant test in vm_uiomove (vm_map.c).
Make two changes to vm_object_coalesce:
1. Determine whether the new range of pages actually overlaps the existing object's range of pages before calling vm_object_page_remove. (Prior to this change almost 90% of the calls to vm_object_page_remove were to remove pages that were beyond the end of the object.)
2. Free any swap space allocated to removed pages.
|
#
47207 |
|
14-May-1999 |
alc |
Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option.
It never makes sense to specify MAP_COPY_NEEDED without also specifying MAP_COPY_ON_WRITE, and vice versa. Thus, MAP_COPY_ON_WRITE suffices.
Reviewed by: David Greenman <dg@root.com>
|
#
45293 |
|
04-Apr-1999 |
alc |
Two changes to vm_map_delete:
1. Don't bother checking object->ref_count == 1 in order to set OBJ_ONEMAPPING. It's a waste of time. If object->ref_count == 1, vm_map_entry_delete will "run-down" the object and its pages.
2. If object->ref_count == 1, ignore OBJ_ONEMAPPING. Wait for vm_map_entry_delete to "run-down" the object and its pages. Otherwise, we're calling two different procedures to delete the object's pages.
Note: "vmstat -s" will once again show a non-zero value for "pages freed by exiting processes".
|
#
45069 |
|
27-Mar-1999 |
alc |
Mainly, eliminate the comments about share maps. (We don't have share maps any more.) Also, eliminate an incorrect comment that says that we don't coalesce vm_map_entry's. (We do.)
|
#
44928 |
|
21-Mar-1999 |
alc |
Two changes:
Remove more (redundant) map timestamp increments from properly synchronized routines. (Changed: vm_map_entry_link, vm_map_entry_unlink, and vm_map_pageable.)
Micro-optimize vm_map_entry_link and vm_map_entry_unlink, eliminating unnecessary dereferences. At the same time, converted them from macros to inline functions.
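A simplified sketch of the inline form; the field names follow the usual doubly linked entry list, and the details are illustrative:

    static __inline void
    vm_map_entry_link(vm_map_t map, vm_map_entry_t after_where,
        vm_map_entry_t entry)
    {
        map->nentries++;
        entry->prev = after_where;
        entry->next = after_where->next;
        entry->next->prev = entry;
        after_where->next = entry;
    }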
|
#
44773 |
|
15-Mar-1999 |
alc |
Two changes:
In general, vm_map_simplify_entry should be performed INSIDE the loop that traverses the map, not outside. (Changed: vm_map_inherit, vm_map_pageable.)
vm_fault_unwire doesn't acquire the map lock (or block holding it). Thus, vm_map_set/clear_recursive shouldn't be called. (Changed: vm_map_user_pageable, vm_map_pageable.)
|
#
44597 |
|
09-Mar-1999 |
alc |
Remove (redundant) map timestamp increments from some properly synchronized routines.
|
#
44569 |
|
08-Mar-1999 |
alc |
Remove an unused variable from vmspace_fork.
|
#
44565 |
|
07-Mar-1999 |
alc |
Change vm_map_growstack to acquire and hold a read lock (instead of a write lock) until it actually needs to modify the vm_map.
Note: it is legal to modify vm_map::hint without holding a write lock.
Submitted by: "Richard Seaman, Jr." <dick@tar.com> with minor changes by myself.
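The resulting locking pattern, roughly (a sketch; the retry label and the must_grow test are illustrative):

    vm_map_lock_read(map);
    /* ... decide whether the stack must actually grow ... */
    if (must_grow) {
        if (vm_map_lock_upgrade(map))
            goto retry;          /* upgrade failed, read lock was lost */
        /* ... modify the map under the exclusive lock ... */
        vm_map_unlock(map);
    } else
        vm_map_unlock_read(map);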
|
#
44396 |
|
02-Mar-1999 |
alc |
Remove the last of the share map code: struct vm_map::is_main_map.
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
|
#
44245 |
|
24-Feb-1999 |
dillon |
Remove unnecessary page protects on map_split and collapse operations. Fix bug where an object's OBJ_WRITEABLE/OBJ_MIGHTBEDIRTY flags do not get set under certain circumstances ( page rename case ).
Reviewed by: Alan Cox <alc@cs.rice.edu>, John Dyson
|
#
44146 |
|
19-Feb-1999 |
luoqi |
Hide access to vmspace::vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper.
Reviewed by: Alan Cox <alc@cs.rice.edu>, Matthew Dillon <dillon@apollo.backplane.com>
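The accessor is a one-line inline; a sketch of its likely shape:

    static __inline struct pmap *
    vmspace_pmap(struct vmspace *vm)
    {
        return (&vm->vm_pmap);
    }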
|
#
44135 |
|
19-Feb-1999 |
dillon |
Submitted by: Alan Cox <alc@cs.rice.edu>
Remove remaining share map garbage from vm_map_lookup() and clean out old #if 0 stuff.
|
#
43923 |
|
12-Feb-1999 |
dillon |
Fix non-fatal bug in vm_map_insert() which improperly cleared OBJ_ONEMAPPING in the case where an object is extended and an additional vm_map_entry must be allocated.
In vm_object_madvise(), remove the call to vm_page_cache() in the MADV_FREE case in order to avoid a page fault on page reuse. However, we still mark the page as clean and destroy any swap backing store.
Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
43748 |
|
07-Feb-1999 |
dillon |
Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks, but were essentially given up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.
|
#
43547 |
|
02-Feb-1999 |
dillon |
Submitted by: Alan Cox
The vm_map_insert()/vm_object_coalesce() optimization has been extended to include OBJT_SWAP objects as well as OBJT_DEFAULT objects. This is possible because it costs nothing to extend an OBJT_SWAP object with the new swapper. We can't do this with the old swapper. The old swapper used a linear array that would have had to have been reallocated, costing time as well as a potential low-memory deadlock.
|
#
43493 |
|
01-Feb-1999 |
dillon |
This patch eliminates a pointless test from appearing twice in vm_map_simplify_entry. Basically, once you've verified that the objects in the adjacent vm_map_entry's are the same, either NULL or the same vm_object, there's no point in checking that the objects have the same behavior.
Obtained from: Alan Cox <alc@cs.rice.edu>
|
#
43476 |
|
31-Jan-1999 |
julian |
Submitted by: Alan Cox <alc@cs.rice.edu> Checked by: "Richard Seaman, Jr." <dick@tar.com> Fix the following problem: As the code stands now, growing any stack, and not just the process's main stack, modifies vm->vm_ssize. This is inconsistent with the code earlier in the same procedure.
|
#
43311 |
|
27-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
43209 |
|
26-Jan-1999 |
julian |
Mostly remove the VM_STACK OPTION. This changes the definitions of a few items so that structures are the same whether or not the option itself is enabled. This allows people to enable and disable the option without recompiling the world.
As the author says:
|I ran into a problem pulling out the VM_STACK option. I was aware of this
|when I first did the work, but then forgot about it. The VM_STACK stuff
|has some code changes in the i386 branch. There need to be corresponding
|changes in the alpha branch before it can come out completely.
what is done:
|
|1) Pull the VM_STACK option out of the header files it appears in. This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches. The vm_map_entry will then always have one
|extra element (avail_ssize). It just won't be used if the VM_STACK
|option is not turned on.
|
|I've also pulled the option out of vm_map.c. This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|
|2) Add what appears to be appropriate code to the alpha branch, still
|protected behind the VM_STACK switch. I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
|
[..]
|
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
|
Submitted by: "Richard Seaman, Jr." <dick@tar.com>
|
#
43138 |
|
24-Jan-1999 |
dillon |
Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL to use the vm_page_dirty() inline.
The inline can thus do sanity checks (or not) over all cases.
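A sketch of the inline described above; the comment marks where optional checks would go:

    static __inline void
    vm_page_dirty(vm_page_t m)
    {
        /* Optional sanity checks (e.g. the page is not free or cached)
         * can live here once, instead of at every caller. */
        m->dirty = VM_PAGE_BITS_ALL;
    }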
|
#
42970 |
|
21-Jan-1999 |
dillon |
General cleanup related to the new pager. We no longer have to worry about conversions of objects to OBJT_SWAP, it is done automatically now.
Replaced manually inserted code with inline calls for busy waiting on pages, which also incidentally fixes a potential PG_BUSY race due to the code not running at splvm().
vm_objects no longer have a paging_offset field ( see vm/vm_object.c )
|
#
42957 |
|
21-Jan-1999 |
dillon |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
42360 |
|
06-Jan-1999 |
julian |
Add (but don't activate) code for a special VM option to make downward growing stacks more general. Add (but don't activate) code to use the new stack facility when running threads, (specifically the linux threads support). This allows people to use both linux compiled linuxthreads, and also the native FreeBSD linux-threads port.
The code is conditional on VM_STACK. Not using this will produce the old heavily tested system.
Submitted by: Richard Seaman <dick@tar.com>
|
#
40648 |
|
25-Oct-1998 |
phk |
Nitpicking and dusting performed on a train. Removes trivial warnings about unused variables, labels and other lint.
|
#
40286 |
|
13-Oct-1998 |
dg |
Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.
|
#
39873 |
|
01-Oct-1998 |
jdp |
Fix a bug in which a page index was used where a byte offset was expected. This bug caused builds of Modula-3 to fail in mysterious ways on SMP kernels. More precisely, such builds failed on systems with kern.fast_vfork equal to 0, the default and only supported value for SMP kernels.
PR: kern/7468 Submitted by: tegge (Tor Egge)
|
#
38799 |
|
04-Sep-1998 |
dfr |
Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.
|
#
38517 |
|
24-Aug-1998 |
dfr |
Change various syscalls to use size_t arguments instead of u_int.
Add some overflow checks to read/write (from bde).
Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptible.
Reviewed by: Bruce Evans <bde@zeta.org.au>
|
#
38135 |
|
06-Aug-1998 |
dfr |
Protect all modifications to paging_in_progress with splvm(). The i386 managed to avoid corruption of this variable by luck (the compiler used a memory read-modify-write instruction which wasn't interruptible) but other architectures cannot.
With this change, I am now able to 'make buildworld' on the alpha (sfx: the crowd goes wild...)
|
#
37640 |
|
14-Jul-1998 |
bde |
Print pointers using %p instead of attempting to print them by casting them to long, etc. Fixed some nearby printf bogons (sign errors not warned about by gcc, and style bugs, but not truncation of vm_ooffset_t's).
Use slightly less bogus casts for passing pointers to ddb command functions.
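The style being enforced, in brief (the db_printf() call is illustrative):

    /* Old style: print a pointer through an integer cast. */
    db_printf("entry at %lx\n", (long)entry);
    /* New style: %p handles pointer width on every platform. */
    db_printf("entry at %p\n", (void *)entry);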
|
#
37562 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
|
#
37555 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
|
#
37094 |
|
21-Jun-1998 |
bde |
Removed unused includes.
|
#
36735 |
|
07-Jun-1998 |
dfr |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change.
The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
#
36275 |
|
21-May-1998 |
dyson |
Make flushing dirty pages work correctly on filesystems that unexpectedly do not complete writes even with sync I/O requests. This should help the behavior of mmaped files when using softupdates (and perhaps in other circumstances also.)
|
#
36112 |
|
16-May-1998 |
dyson |
An important fix for proper inheritance of backing objects for object splits. Another excellent detective job by Tor. Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>
|
#
35694 |
|
04-May-1998 |
dyson |
Fix the shm panic. I mistakenly used the shadow_count to keep the object from being split, and instead added an OBJ_NOSPLIT.
|
#
35669 |
|
04-May-1998 |
dyson |
Work around some VM bugs, the worst being an overly aggressive swap space free calculation. More complete fixes will be forthcoming, in a week.
|
#
35615 |
|
02-May-1998 |
dyson |
Another minor cleanup of the split code. Make sure that pages are busied during the entire time, so that the waits for pages being unbusy don't make the objects inconsistent.
|
#
35571 |
|
01-May-1998 |
dyson |
Fix a minor bug in the new overused-swap fix.
|
#
35499 |
|
29-Apr-1998 |
dyson |
Add a needed prototype, and fix a panic problem with the new memory code.
|
#
35497 |
|
29-Apr-1998 |
dyson |
Tighten up management of memory and swap space during map allocation and deallocation cycles. This should provide a measurable improvement on swap and memory allocation on loaded systems. It is unlikely to be a complete solution. Also, provide more map info with procfs. Chuck Cranor spurred on this improvement.
|
#
35485 |
|
28-Apr-1998 |
dyson |
Fix a pseudo-swap leak problem. This mitigates "leaks" due to freeing partial objects: previously, freeing part of an object, rather than the entire object, didn't free any of it. Simple fix to the map code. Reviewed by: dg
|
#
34206 |
|
07-Mar-1998 |
dyson |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated.
1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid.
2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them.
3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state.
4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances.
5) Remove a gratuitous buffer wakeup in vfs_vmio_release.
6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version.
7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation.
8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed.
9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals.
10) Correct and clean-up spec_getpages.
11) Implement a fully functional nfs_getpages, nfs_putpages.
12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.)
13) Properly support MS_INVALIDATE on NFS.
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean.
15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads.
16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.)
17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.)
18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
#
33817 |
|
25-Feb-1998 |
dyson |
Fix page prezeroing for SMP, and fix some potential paging-in-progress hangs. The paging-in-progress diagnosis was a result of Tor Egge's excellent detective work. Submitted by: Partially from Tor Egge.
|
#
33758 |
|
23-Feb-1998 |
dyson |
Significantly improve the efficiency of the swap pager, which appears to have declined due to code-rot over time. The swap pager rundown code has been cleaned up, and unneeded wakeups removed. Lots of splbio's are changed to splvm's. Also, set the dynamic tunables for the pageout daemon to be more sane for larger systems (thereby decreasing the daemon overhead).
|
#
33676 |
|
20-Feb-1998 |
bde |
Removed unused #includes.
|
#
33181 |
|
09-Feb-1998 |
eivind |
Staticize.
|
#
33173 |
|
08-Feb-1998 |
dyson |
Fix an argument to vn_lock. It appears that a lot of the vn_lock usage is a bit undisciplined, and should be checked carefully.
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33109 |
|
05-Feb-1998 |
dyson |
1) Start using a cleaner and more consistent page allocator instead of the various ad-hoc schemes.
2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some processor errata, and to minimize redundant processor updating of page tables.
4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed.
5) Streamline read-only to read-write page mappings.
6) For pmap_copy_page, don't enable write mapping for source page.
7) Correct and clean-up pmap_incore.
8) Cluster initial kern_exec paging.
9) Removal of some minor lint from kern_malloc.
10) Correct some ioopt code.
11) Remove some dead code from the MI swapout routine.
12) Correct vm_object_deallocate (to remove backing_object ref.)
13) Fix dead object handling, that had problems under heavy memory load.
14) Add minor vm_page_lookup improvements.
15) Some pages are not in objects, and make sure that the vm_page.c can properly support such pages.
16) Add some more page deficit handling.
17) Some minor code readability improvements.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
32937 |
|
31-Jan-1998 |
dyson |
Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes."
Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.
|
#
32702 |
|
22-Jan-1998 |
dyson |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous.
2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.)
13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
#
32670 |
|
21-Jan-1998 |
dyson |
Allow gdb to work again.
|
#
32585 |
|
17-Jan-1998 |
dyson |
Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management.
The system should be much more stable, but not all subtle bugs are fixed yet.
|
#
32454 |
|
11-Jan-1998 |
dyson |
Fix some vnode management problems, and better mgmt of vnode free list. Fix the UIO optimization code. Fix an assumption in vm_map_insert regarding allocation of swap pagers. Fix an spl problem in the collapse handling in vm_object_deallocate. When pages are freed from vnode objects, and the criteria for putting the associated vnode onto the free list is reached, either put the vnode onto the list, or put it onto an interrupt safe version of the list, for further transfer onto the actual free list. Some minor syntax changes changing pre-decs, pre-incs to post versions. Remove a bogus timeout (that I added for debugging) from vn_lock.
PHK will likely still have problems with the vnode list management, and so do I, but it is better than it was.
|
#
32286 |
|
06-Jan-1998 |
dyson |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does.
When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex.
Vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore.
A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes.
Please be careful with your system while running this code, but I would greatly appreciate feedback as soon as reasonably possible.
|
#
32072 |
|
28-Dec-1997 |
dyson |
Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem with the object cache removal.
|
#
32071 |
|
28-Dec-1997 |
dyson |
Lots of improvements, including restructring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include:
1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync.
Be gentle, and please give me feedback asap.
|
#
31991 |
|
25-Dec-1997 |
dyson |
The ioopt code is still buggy, but wasn't fully disabled.
|
#
31857 |
|
19-Dec-1997 |
dyson |
Change bogus usage of btoc to atop. The incorrect usage of btoc was pointed out by bde.
|
#
31853 |
|
19-Dec-1997 |
dyson |
Some performance improvements, and code cleanups (including changing our expensive OFF_TO_IDX to btoc whenever possible.)
|
#
31392 |
|
24-Nov-1997 |
bde |
Don't #define max() to get a version that works with vm_ooffset's. Just use qmax().
This should be fixed more generally using overloaded functions.
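Usage, in brief; qmax() takes and returns quad_t, which is wide enough for vm_ooffset_t (the operands here are illustrative):

    /* No local max() macro needed for 64-bit offsets. */
    vm_ooffset_t end = qmax(end_a, end_b);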
|
#
31175 |
|
14-Nov-1997 |
tegge |
Simplify map entries during user page wire and user page unwire operations in vm_map_user_pageable().
Check return value of vm_map_lock_upgrade() during a user page wire operation.
|
#
31016 |
|
07-Nov-1997 |
phk |
Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by: -Wunused
|
#
30813 |
|
28-Oct-1997 |
bde |
Removed unused #includes.
|
#
30700 |
|
24-Oct-1997 |
dyson |
Decrease the initial allocation for the zone allocations.
|
#
30354 |
|
12-Oct-1997 |
phk |
Last major round (unless Bruce thinks of something :-) of malloc changes.
Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them.
A couple of finer points by: bde
|
#
30309 |
|
11-Oct-1997 |
phk |
Distribute and staticize a lot of the malloc M_* types.
Substantial input from: bde
|
#
29653 |
|
21-Sep-1997 |
dyson |
Change the M_NAMEI allocations to use the zone allocator. This change plus the previous changes to use the zone allocator decrease the usage of malloc by half. The zone allocator will be upgradeable to be able to use per-CPU pools, and has more intelligent usage of SPLs. Additionally, it has reasonable stats gathering capabilities, while making most calls inline.
|
#
29316 |
|
12-Sep-1997 |
jlemon |
Do not consider VM_PROT_OVERRIDE_WRITE to be part of the protection entry when handling a fault. This is set by procfs whenever it wants to write to a page, as a means of overriding `r-x COW' entries, but causes failures in the `rwx' case.
Submitted by: bde
|
#
28992 |
|
01-Sep-1997 |
bde |
Removed unused #includes.
|
#
28751 |
|
25-Aug-1997 |
bde |
Fixed type mismatches for functions with args of type vm_prot_t and/or vm_inherit_t. These types are smaller than ints, so the prototypes should have used the promoted type (int) to match the old-style function definitions. They use just vm_prot_t and/or vm_inherit_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change the small types to u_int, to optimize for time instead of space.
|
#
28349 |
|
18-Aug-1997 |
fsmp |
Added includes of smp.h for SMP. This eliminates a bazillion warnings about implicit s_lock & friends.
|
#
28345 |
|
18-Aug-1997 |
dyson |
Fix kern_lock so that it will work. Additionally, clean-up some of the VM systems usage of the kernel lock (lockmgr) code. This is a first pass implementation, and is expected to evolve as needed. The API for the lock manager code has not changed, but the underlying implementation has changed significantly. This change should not materially affect our current SMP or UP code without non-standard parameters being used.
|
#
27930 |
|
06-Aug-1997 |
dyson |
Add exposure of some vm_zone allocation stats by sysctl. Also, change the initialization parameters of some zones in VM map. This contains only optimizations and not bugfixes.
|
#
27924 |
|
05-Aug-1997 |
dyson |
Fixed the commit botch that was causing crashes soon after system startup. Due to the error, the initialization of the zone for pv_entries was missing. The system should be usable again.
|
#
27923 |
|
05-Aug-1997 |
dyson |
Another attempt at cleaning up the new memory allocator.
|
#
27922 |
|
05-Aug-1997 |
dyson |
Fix some bugs, document vm_zone better. Add copyright to vm_zone.h. Use the new zone code in pmap.c so that we can get rid of the ugly ad-hoc allocations in pmap.c.
|
#
27905 |
|
04-Aug-1997 |
dyson |
Modify pmap to use our new memory allocator. Also, change the vm_map_entry allocations to be interrupt safe.
|
#
27899 |
|
04-Aug-1997 |
dyson |
Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of a simple, clean zone type allocator. This new allocator will also be used for machine dependent pmap PV entries.
|
#
27715 |
|
27-Jul-1997 |
dyson |
Fix a very subtle problem that causes unnecessary numbers of objects backing a single logical object. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
26851 |
|
23-Jun-1997 |
tegge |
Don't try upgrading an existing exclusive lock in vm_map_user_pageable. This should close PR kern/3180. Also remove a bogus unconditional call to vm_map_unlock_read in vm_map_lookup.
|
#
26667 |
|
15-Jun-1997 |
dyson |
Fix a reference problem with maps. Only appears to manifest itself when sharing address spaces.
|
#
24848 |
|
12-Apr-1997 |
dyson |
Fully implement vfork. Vfork is now much much faster than even our fork. (On my machine, fork is about 240usecs, vfork is 78usecs.)
Implement rfork(!RFPROC !RFMEM), which allows a thread to divorce its memory from the other threads of a group.
Implement rfork(!RFPROC RFCFDG), which closes all file descriptors, eliminating possible existing shares with other threads/processes.
Implement rfork(!RFPROC RFFDG), which divorces the file descriptors for a thread from the rest of the group.
Fix the case where a thread does an exec. It is almost nonsense for a thread to modify the other threads address space by an exec, so we now automatically divorce the address space before modifying it.
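Usage sketches for the non-RFPROC cases above; without RFPROC no child is created and the flags act on the caller:

    #include <unistd.h>

    static void
    unshare_fds(void)
    {
        rfork(RFFDG);     /* divorce the descriptor table from the group */
        /* rfork(RFCFDG) would instead close all descriptors; */
        /* rfork(RFPROC | RFMEM) forks a child sharing the vm space. */
    }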
|
#
24691 |
|
07-Apr-1997 |
peter |
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!)
Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit.
Context switch the tss_esp0 pointer in the common tss. This is a lot simpler than switching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss, since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset.
The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS).
Simplify the pmap code to manage process contexts; we no longer have to double map the UPAGES, which simplifies and should measurably speed up fork().
The following parts came from John Dyson:
Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out.
Move the upages object (upobj) from the vmspace to the proc structure.
Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
|
#
24668 |
|
06-Apr-1997 |
dyson |
Make vm_map_protect be more complete about map simplification. This is useful when a process changes its page range protections very much. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
24666 |
|
06-Apr-1997 |
dyson |
Fix the gdb executable modify problem. Thanks to the detective work by Alan Cox <alc@cs.rice.edu>, and his description of the problem.
The bug was primarily in procfs_mem, but the mistake likely happened due to the lack of vm system support for the operation. I added better support for selective marking of page dirty flags so that vm_map_pageable(wiring) will not cause this problem again.
The code in procfs_mem is now less bogus (but maybe still a little so.)
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
22156 |
|
31-Jan-1997 |
dyson |
Another fix to inheriting shared segments. Do the copy on write thing if needed. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
21940 |
|
21-Jan-1997 |
dyson |
Fix two problems where a NULL object is dereferenced. One problem was in the VM_INHERIT_SHARE case of vmspace_fork, and also in vm_map_madvise. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
21754 |
|
16-Jan-1997 |
dyson |
Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.
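The shape of the change, roughly (member and flag values are illustrative):

    /* Before: one C bit-field per attribute. */
    /*   u_char is_sub_map:1, copy_on_write:1, needs_copy:1; */

    /* After: a single flags word tested and set with masks. */
    #define MAP_ENTRY_IS_SUB_MAP   0x0001  /* values are illustrative */
    #define MAP_ENTRY_COW          0x0002
    #define MAP_ENTRY_NEEDS_COPY   0x0004

    entry->eflags |= MAP_ENTRY_NEEDS_COPY;
    if (entry->eflags & MAP_ENTRY_COW) {
        /* ... */
    }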
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
21258 |
|
03-Jan-1997 |
dyson |
Undo the collapse breakage (swap space usage problem.)
|
#
21157 |
|
01-Jan-1997 |
dyson |
Guess what? We left a lot of the old collapse code that is not needed anymore with the "full" collapse fix that we added about 1yr ago!!! The code has been removed by optioning it out for now, so we can put it back in ASAP if any problems are found.
|
#
21134 |
|
31-Dec-1996 |
dyson |
A very significant improvement in the management of process maps and objects. Previously, "fancy" memory management techniques such as that used by the M3 RTS would have the tendency of chopping up processes' allocated memory into lots of little objects. Alan has come up with some improvements to mitigate the situation to the point where even the M3 RTS only has one object for bss and its managed memory (when running CVSUP.) (There are still cases where the situation isn't improved when the system pages -- but this is much much better for the vast majority of cases.) The system will now be able to much more effectively merge map entries.
Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
20993 |
|
28-Dec-1996 |
dyson |
Eliminate the redundancy due to the similarity between the routines vm_map_simplify and vm_map_simplify_entry. Make vm_map_simplify_entry handle wired maps so that we can get rid of vm_map_simplify. Modify the callers of vm_map_simplify to properly use vm_map_simplify_entry. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
20449 |
|
14-Dec-1996 |
dyson |
Implement closer-to POSIX mlock semantics. The major difference is that we do allow mlock to span unallocated regions (of course, not mlocking them.) We also allow mlocking of RO regions (which the old code couldn't.) The restriction there is that once a RO region is wired (mlocked), it cannot be debugged (or EVER written to.)
Under normal usage, the new mlock code will be a significant improvement over our old stuff.
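Usage is unchanged; what changed is what the call tolerates. A sketch (addr and len are illustrative):

    #include <sys/mman.h>

    /* The range may now span unallocated regions (not wired, but not an
     * error) and read-only regions; a wired RO region cannot be written
     * or debugged until it is unwired again. */
    if (mlock(addr, len) == 0)
        (void)munlock(addr, len);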
|
#
20189 |
|
07-Dec-1996 |
dyson |
Expunge inlines...
|
#
20187 |
|
07-Dec-1996 |
dyson |
Fix a map entry leak problem found by DG. Also, de-inline a function vm_map_entry_dispose, because it won't help being inlined.
|
#
20182 |
|
06-Dec-1996 |
dyson |
Make vm_map_insert much more intelligent in the MAP_NOFAULT case so that map entries are coalesced when appropriate. Also, conditionalize some code that is currently not used in vm_map_insert. This mod has been added to eliminate unnecessary map entries in buffer map.
Additionally, there were some cases where map coalescing could be done when it shouldn't. That problem has been resolved.
|
#
20054 |
|
30-Nov-1996 |
dyson |
Implement a new totally dynamic (up to MAXPHYS) buffer kva allocation scheme. Additionally, add the capability for checking for unexpected kernel page faults. The maximum amount of kva space for buffers hasn't been decreased from where it is, but it will now be possible to do so.
This scheme manages the kva space similar to the buffers themselves. If there isn't enough kva space because of usage or fragmentation, buffers will be reclaimed until a buffer allocation is successful. This scheme should be very resistant to fragmentation problems until/if the LFS code is fixed and uses the bogus buffer locking scheme -- but a 'fixed' LFS is not likely to use such a scheme.
Now there should be NO problem allocating buffers up to MAXPHYS.
|
#
18298 |
|
14-Sep-1996 |
bde |
Attached vm ddb commands `show map', `show vmochk', `show object', `show vmopag', `show page' and `show pageq'. Moved all vm ddb stuff to the ends of the vm source files.
Changed printf() to db_printf(), `indent' to db_indent, and iprintf() to db_iprintf() in ddb commands. Moved db_indent and db_iprintf() from vm to ddb.
vm_page.c: Don't use __pure. Staticized.
db_output.c: Reduced page width from 80 to 79 to inhibit double spacing for long lines (there are still some problems if words are printed across column 79).
|
#
18178 |
|
08-Sep-1996 |
dyson |
Fixed the use of the wrong variable in vm_map_madvise.
|
#
18163 |
|
08-Sep-1996 |
dyson |
Improve the scalability of certain pmap operations.
|
#
17334 |
|
30-Jul-1996 |
dyson |
Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
|
#
17294 |
|
27-Jul-1996 |
dyson |
This commit is meant to solve a couple of VM system problems or performance issues.
1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines.
2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations.
3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)).
4) Changed pv queue manipulations/structures to be TAILQ's.
5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one.
6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation.
7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
|
#
16993 |
|
07-Jul-1996 |
dg |
In all special cases for spl or page_alloc where kmem_map is checked for, mb_map (a submap of kmem_map) must also be checked. Thanks to wcarchive (err...sort of) for demonstrating this bug.
|
#
16409 |
|
16-Jun-1996 |
dyson |
Various bugfixes/cleanups from me and others:
1) Remove potential race conditions on waking up in vm_page_free_wakeup by making sure that it is at splvm().
2) Fix another bug in vm_map_simplify_entry.
3) Be more complete about converting from default to swap pager when an object grows to be large enough that there can be a problem with data structure allocation under low memory conditions.
4) Make some madvise code more efficient.
5) Added some comments.
|
#
16318 |
|
12-Jun-1996 |
dyson |
Fix some serious errors in vm_map_simplify_entries.
|
#
16026 |
|
30-May-1996 |
dyson |
This commit is dual-purpose, to fix more of the pageout daemon queue corruption problems, and to apply Gary Palmer's code cleanups. David Greenman helped with these problems also. There is still a hang problem using X in small memory machines.
|
#
15978 |
|
29-May-1996 |
dyson |
Make sure that pageout deadlocks cannot occur. There is a problem that the datastructures needed to support the swap pager can take enough space to fully deplete system memory, and cause a deadlock. This change keeps large objects from being filled with dirty pages without the appropriate swap pager datastructures. Right now, default objects greater than 1/4 the size of available system memory are converted to swap objects, thereby eliminating the risk of deadlock.
|
#
15873 |
|
22-May-1996 |
dyson |
Initial support for MADV_FREE, support for pages that we don't care about the contents anymore. This gives us a lot of the advantage of freeing individual pages through munmap, but with almost none of the overhead.
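A usage sketch (buf and len are illustrative):

    #include <sys/mman.h>

    /* Keep the mapping, but tell the VM the current contents are
     * disposable; pages can be reclaimed without being paged out. */
    (void)madvise(buf, len, MADV_FREE);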
|
#
15819 |
|
19-May-1996 |
dyson |
Initial support for mincore and madvise. Both are almost fully supported, except madvise does not page in with MADV_WILLNEED, and MADV_DONTNEED doesn't force dirty pages out.
|
#
15809 |
|
18-May-1996 |
dyson |
This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:
More usage of the TAILQ macros. Additional minor fix to queue.h.
Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm.
Significant revamp of the pmap/fork code:
1) PTE's and UPAGES's are NO LONGER in the process's map.
2) PTE's and UPAGES's reside in their own objects.
3) TOTAL elimination of recursive page table pagefaults.
4) The page directory now resides in the PTE object.
5) Implemented pmap_copy, thereby speeding up fork time.
6) Changed the pv entries so that the head is a pointer and not an entire entry.
7) Significant cleanup of pmap_protect, and pmap_remove.
8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module.
9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed.
Performance and code cleanups in vm_map:
1) Improved and simplified allocation of map entries.
2) Improved vm_map_copy code.
3) Corrected some minor problems in the simplify code.
Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh.
Improved the speed of and simplified kmem_malloc.
Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existent condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.)
Minor reorg of source to perhaps improve locality of ref.
|
#
15583 |
|
03-May-1996 |
phk |
Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
|
#
15459 |
|
29-Apr-1996 |
dyson |
Move the map entry allocations from the kmem_map to the kernel_map. As a side effect, correct the associated object offset.
|
#
15018 |
|
03-Apr-1996 |
dyson |
Fixed a problem that the UPAGES of a process were being run down in a suboptimal manner. I had also noticed some panics that appeared to be at least superficially caused by this problem. Also, included are some minor mods to support more general handling of page table page faulting. More details in a future commit.
|
#
14865 |
|
28-Mar-1996 |
dyson |
VM performance improvements, and reorder some operations in VM fault in anticipation of a fix in pmap that will allow the mlock system call to work without panicing the system.
|
#
14864 |
|
28-Mar-1996 |
dyson |
More map_simplify fixes from Alan Cox. This very significantly improves the performance when the map has been chopped up. The map simplify operations really work now. Reviewed by: dyson Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
14610 |
|
12-Mar-1996 |
dyson |
This commit is as a result of a comment by Alan Cox (alc@cs.rice.edu) regarding the "real" problem with maps that we have been having over the last few weeks. He noted that the first_free pointer was left dangling in certain circumstances -- and he was right!!! This should fix the map problems that we were having, and also give us the advantage of being able to simplify maps more aggressively.
|
#
14589 |
|
12-Mar-1996 |
dyson |
Fix the map corruption problem that appears as a u_map allocation error.
|
#
14428 |
|
09-Mar-1996 |
dyson |
Fix two problems: The pmap_remove in vm_map_clean incorrectly unmapped the entire map entry. The new vm_map_simplify_entry code had an error (the offset of the combined map entry was not set correctly.) Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
14366 |
|
04-Mar-1996 |
dyson |
Fix a problem where pages in a mapped region were not always properly invalidated. Now we traverse the object shadow chain properly.
|
#
14360 |
|
03-Mar-1996 |
peter |
Remove the #ifdef notyet from the prototype of vm_map_simplify. John re-enabled the function but missed the prototype, causing a warning.
|
#
14316 |
|
02-Mar-1996 |
dyson |
1) Eliminate unnecessary bzero of UPAGES.
2) Eliminate unnecessary copying of pages during/after forks.
3) Add user map simplification.
|
#
14036 |
|
11-Feb-1996 |
dyson |
Fixed a really bogus problem with msync ripping pages away from objects before they were written. Also, don't allow processes without write access to remove pages from vm_objects.
|
#
13490 |
|
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed-up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache.
Efficiency improvement for vfs_cluster. It used to make a lot of redundant calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources. The pageout code no longer thrashes as readily.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving the efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
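The flags-to-indices change replaces per-flag bit tests with a direct array lookup; the structures below are illustrative approximations, not the commit's exact definitions:

    #include <sys/queue.h>

    #define PQ_FREE     0
    #define PQ_INACTIVE 1
    #define PQ_ACTIVE   2
    #define PQ_CACHE    3
    #define PQ_COUNT    4

    struct vm_page {
            TAILQ_ENTRY(vm_page) pageq;
            int queue;                      /* index, replacing PG_* flag bits */
    };
    TAILQ_HEAD(pglist, vm_page);

    static struct pglist vm_page_queues[PQ_COUNT];

    /* Before: test PG_ACTIVE/PG_INACTIVE/... to find the right queue.
     * After: the stored index selects it directly. */
    static void
    vm_page_unqueue(struct vm_page *m)
    {
            TAILQ_REMOVE(&vm_page_queues[m->queue], m, pageq);
    }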
|
#
13228 |
|
04-Jan-1996 |
wollman |
Convert DDB to new-style option.
|
#
12820 |
|
14-Dec-1995 |
phk |
Another mega commit to staticize things.
|
#
12767 |
|
11-Dec-1995 |
dyson |
Changes to support 1TB file sizes. Pages are now named by an (object, index) pair instead of an (object, offset) pair.
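The index is the byte offset scaled down by the page size, so a 32-bit page index can name far more of a file than a 32-bit byte offset could. A minimal sketch (FreeBSD's real macros are OFF_TO_IDX/IDX_TO_OFF; the type widths here are assumptions):

    #include <stdint.h>

    #define PAGE_SHIFT 12                   /* 4 KB pages assumed */

    typedef int64_t  vm_ooffset_t;          /* byte offset into an object */
    typedef uint32_t vm_pindex_t;           /* page index within an object */

    #define OFF_TO_IDX(off) ((vm_pindex_t)((vm_ooffset_t)(off) >> PAGE_SHIFT))
    #define IDX_TO_OFF(idx) ((vm_ooffset_t)(idx) << PAGE_SHIFT)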
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12423 |
|
20-Nov-1995 |
phk |
Remove unused vars & funcs, make things static, protoize a little bit.
|
#
12226 |
|
12-Nov-1995 |
dg |
Moved vm_map_lock call to inside the splhigh protection in vm_map_find(). This closes a probably rare but nonetheless real window that would result in a process hanging or the system panicking.
Reviewed by: dyson, davidg Submitted by: kato@eclogite.eps.nagoya-u.ac.jp (KATO Takenori)
|
#
11709 |
|
23-Oct-1995 |
dyson |
Get rid of machine-dependent NBPG and replace with PAGE_SIZE.
|
#
10344 |
|
26-Aug-1995 |
bde |
Change vm_map_print() to have the correct number and type of args for a ddb command.
|
#
9507 |
|
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, an un_pager union was created in the object to contain these items. (A sketch of this layout follows the list below.)
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #13 and #14 were done to simplify the code and improve readability and maintainability. (As were most of these changes.)
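The type-indexed pager arrangement from item 2 looks roughly like this; the sketch borrows the later OBJT_* names and is not the commit's exact code:

    struct vm_object;
    struct vm_page;
    struct vnode;

    /* One ops vector per pager; objects select theirs by type. */
    struct pagerops {
            int (*pgo_getpages)(struct vm_object *, struct vm_page **, int, int);
            int (*pgo_putpages)(struct vm_object *, struct vm_page **, int, int);
    };

    enum obj_type { OBJT_DEFAULT, OBJT_SWAP, OBJT_VNODE, OBJT_DEVICE };

    struct vm_object {
            enum obj_type type;                     /* indexes pagertab */
            union {
                    struct { struct vnode *vnp_vp; } vnp;   /* vnode pager state */
                    struct { void *swp_meta; } swp;         /* swap pager state */
            } un_pager;
    };

    extern struct pagerops *pagertab[];             /* indexed by object type */

    static int
    object_getpages(struct vm_object *obj, struct vm_page **ma, int count, int req)
    {
            return (pagertab[obj->type]->pgo_getpages(obj, ma, count, req));
    }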
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
7883 |
|
16-Apr-1995 |
dg |
Moved some zero-initialized variables into .bss. Made code intended to be called only from DDB #ifdef DDB. Removed some completely unused globals.
|
#
7365 |
|
25-Mar-1995 |
dg |
Pass syncio flag to vm_object_clean(). It remains unimplemented, however.
|
#
7246 |
|
22-Mar-1995 |
dg |
Removed unused fifth argument to vm_object_page_clean(). Fixed bug with VTEXT not always getting cleared when it is supposed to. Added check to make sure that vm_object_remove() isn't called with a NULL pager or for a pager for an OBJ_INTERNAL object (neither of which will be on the hash list). Clear OBJ_CANPERSIST if we decide to terminate it because of no resident pages.
|
#
7204 |
|
20-Mar-1995 |
dg |
Added a new boolean argument to vm_object_page_clean that causes it to only toss out clean pages if TRUE.
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
6816 |
|
01-Mar-1995 |
dg |
Various changes from John and myself that do the following:
Created new functions vm_object_pip_wakeup and pagedaemon_wakeup, which are used to reduce the actual number of wakeups. New function vm_page_protect, which is used in conjunction with some new page flags to reduce the number of calls to pmap_page_protect. Minor changes to reduce unnecessary spl nesting. Rewrote vm_page_alloc() to improve readability. Various other mostly cosmetic changes.
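A flag-gated wrapper of the sort described might look like the following; the flag names and exact logic are assumptions for illustration, and kernel VM headers such as <vm/vm.h> and <vm/vm_page.h> are presumed:

    /* Skip the expensive pmap traversal when the page flags show
     * there is nothing left to revoke. */
    static void
    vm_page_protect(vm_page_t m, vm_prot_t prot)
    {
            if (prot == VM_PROT_NONE) {
                    if (m->flags & (PG_MAPPED | PG_WRITEABLE)) {
                            pmap_page_protect(VM_PAGE_TO_PHYS(m), prot);
                            m->flags &= ~(PG_MAPPED | PG_WRITEABLE);
                    }
            } else if (prot == VM_PROT_READ && (m->flags & PG_WRITEABLE)) {
                    pmap_page_protect(VM_PAGE_TO_PHYS(m), prot);
                    m->flags &= ~PG_WRITEABLE;
            }
    }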
|
#
6584 |
|
20-Feb-1995 |
dg |
Set pages allocated for map entries as valid.
|
#
6351 |
|
14-Feb-1995 |
dg |
Fixed problem with msync causing a panic.
Submitted by: John Dyson
|
#
6129 |
|
02-Feb-1995 |
dg |
swap_pager.c: Fixed a long-standing bug in freeing swap space during object collapses. Fixed 'out of space' messages printing out too often. Modified to use the new kmem_malloc() calling convention. Implemented an additional stat in the swap pager struct to count the amount of space allocated to that pager. This may be removed at some point in the future. Minimized unnecessary wakeups.
vm_fault.c: Don't try to collect fault stats on 'swapped' processes - there aren't any upages to store the stats in. Changed read-ahead policy (again!).
vm_glue.c: Be sure to gain a reference to the process's map before swapping. Be sure to lose it when done.
kern_malloc.c: Added the ability to specify if allocations are at interrupt time or are 'safe'; this affects what types of pages can be allocated.
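This distinction survives in today's malloc(9) as the M_NOWAIT/M_WAITOK flags; the 1995 interface differed, so the sketch below shows the modern descendant of the idea:

    #include <sys/param.h>
    #include <sys/malloc.h>

    MALLOC_DEFINE(M_EXAMPLE, "example", "illustrative allocations");

    static void
    alloc_example(void)
    {
            void *p, *q;

            /* Interrupt context: must not sleep, so the call may fail. */
            p = malloc(128, M_EXAMPLE, M_NOWAIT);
            if (p == NULL)
                    return;                 /* caller must cope with failure */

            /* Process ("safe") context: may sleep until memory is available. */
            q = malloc(128, M_EXAMPLE, M_WAITOK);

            free(q, M_EXAMPLE);
            free(p, M_EXAMPLE);
    }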
vm_map.c: Fixed a variety of map lock problems; there's still a lurking bug that will eventually bite.
vm_object.c: Explicitly initialize the object fields rather than bzeroing the struct. Eliminated the 'rcollapse' code and folded its functionality into the "real" collapse routine. Moved an object_unlock() so that the backing_object is protected in the qcollapse routine. Make sure nobody fools with the backing_object when we're destroying it. Added some diagnostic code which can be called from the debugger that looks through all the internal objects and makes certain that they all belong to someone.
vm_page.c: Fixed a rather serious logic bug that would result in random system crashes. Changed pagedaemon wakeup policy (again!).
vm_pageout.c: Removed unnecessary page rotations on the inactive queue. Changed the number of pages to explicitly free to just free_reserved level.
Submitted by: John Dyson
|
#
5841 |
|
24-Jan-1995 |
dg |
Added the ability to detect sequential faults and DTRT. (swap_pager.c) Added a hook for pmap_prefault() and use a symbolic constant for the new third argument to vm_page_alloc(). (vm_fault.c, various) Changed the way that upages and page tables are held. (vm_glue.c) Fixed an architectural flaw in allocating pages at interrupt time that was introduced with the merged cache changes. (vm_page.c, various) Adjusted some algorithms to achieve better paging performance and to accommodate the fix for the architectural flaw mentioned above. (vm_pageout.c) Fixed a pbuf handling problem; changed policy on handling the read-behind page. (vnode_pager.c) A sketch of the sequential-fault idea follows below.
Submitted by: John Dyson
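One plausible shape for the sequential-fault detection mentioned above; everything here is a hypothetical illustration, not the commit's code:

    /* Consecutive faults on adjacent pages widen the read-ahead
     * window; a non-sequential fault resets it. */
    static unsigned long last_pindex;
    static int seq_count;

    static int
    fault_readahead(unsigned long pindex)
    {
            if (pindex == last_pindex + 1) {
                    if (seq_count < 3)
                            seq_count++;    /* access looks sequential */
            } else {
                    seq_count = 0;          /* random access: back off */
            }
            last_pindex = pindex;
            return (1 << seq_count);        /* 1, 2, 4, or 8 pages ahead */
    }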
|
#
5464 |
|
10-Jan-1995 |
dg |
Fixed some formatting weirdness that I overlooked in the previous commit.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it; it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c vm_map.c Dynamic kernel VM size; now we don't have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h Fixed the problem where the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
5151 |
|
18-Dec-1994 |
dg |
Fixed multiple bogons with the map entry handling.
|
#
5146 |
|
18-Dec-1994 |
dg |
Fixed bug where statically allocated map entries might be freed to the malloc pool...causing a panic.
Submitted by: John Dyson
|
#
5114 |
|
15-Dec-1994 |
dg |
Protect kmem_map modifications with splhigh() to work around a problem with the map being locked at interrupt time.
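The workaround follows the classic pre-SMP BSD idiom of raising the interrupt priority level around the critical section; splhigh/splx and the vm_map lock calls are the real primitives, while the surrounding function is illustrative:

    static void
    kmem_map_modify(vm_map_t map)
    {
            int s;

            s = splhigh();          /* block interrupts that might touch the map */
            vm_map_lock(map);
            /* ... add or remove map entries ... */
            vm_map_unlock(map);
            splx(s);                /* restore the previous priority level */
    }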
|
#
3449 |
|
08-Oct-1994 |
phk |
Cosmetics: unused vars, ()'s, #include's &c &c to silence gcc. Reviewed by: davidg
|
#
2112 |
|
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
1835 |
|
04-Aug-1994 |
dg |
Added some code that was accidentally left out early in the 1.x -> 2.0 VM system conversion.
Submitted by: John Dyson
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|