#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
266498 |
|
21-May-2014 |
pho |
MFC r265534:
msync(2) must return ENOMEM and not EINVAL when the address is outside the allowed range or when one or more pages are not mapped. This is according to The Open Group Base Specifications Issue 7.
Sponsored by: EMC / Isilon storage division
|
#
260208 |
|
02-Jan-2014 |
jhb |
MFC 255708,255711,255731: Extend the support for exempting processes from being killed when swap is exhausted.
- Add a new protect(1) command that can be used to set or revoke protection from arbitrary processes. Similar to ktrace it can apply a change to all existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for control operations on processes (as opposed to the debugger-specific operations provided by ptrace(2)). procctl(2) uses a combination of idtype_t and an id to identify the set of processes on which to operate similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status of a set of processes. MADV_PROTECT still works for backwards compatibility.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc) the first bit of which is used to track if P_PROTECT should be inherited by new child processes.
|
#
258870 |
|
03-Dec-2013 |
jhb |
MFC 253471,253620,254430,254538: Change mmap() to more optimally use superpages and provide support for tweaking alignment of virtual mappings.
- Add a new address space allocation method (VMFS_OPTIMAL_SPACE) for vm_map_find() that will try to alter the alignment of a mapping to match any existing superpage mappings of the object being mapped. If no suitable address range is found with the necessary alignment, vm_map_find() will fall back to using the simple first-fit strategy (VMFS_ANY_SPACE).
- Change mmap() without MAP_FIXED, shmat(), shm_map(), and the GEM mapping ioctl to use VMFS_OPTIMAL_SPACE instead of VMFS_ANY_SPACE.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n). Requests with n >= the number of bits in a pointer, or with an alignment smaller than the size of a page, fail with EINVAL. This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used to optimize the chances of using large pages. By default it will align the mapping on a large page boundary (the system is free to choose any large page size to align to that seems best for the mapping request). However, if the object being mapped is already using large pages, then it will align the virtual mapping to match the existing large pages in the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment. MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than explicitly using VMFS_SUPER_SPACE. All device objects are forced to use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively equivalent.
PR: ports/184173 (exp-run)
|
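The MAP_ALIGNED(n) range rule stated in this entry can be sketched as a small userland check. This is an illustration only: the helper names map_aligned_shift_valid() and map_aligned_boundary() and the page_shift parameter (log2 of the page size) are assumptions made for the example and are not taken from the kernel source.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether an alignment shift n would be
 * accepted under the rule quoted in the log entry.  page_shift is the
 * base-2 logarithm of the page size (12 for 4KB pages).
 */
static bool
map_aligned_shift_valid(unsigned int n, unsigned int page_shift)
{
	/* n must be smaller than the number of bits in a pointer. */
	if (n >= sizeof(void *) * CHAR_BIT)
		return (false);
	/* The requested boundary must be at least one page. */
	if (n < page_shift)
		return (false);
	return (true);
}

/*
 * Hypothetical helper: the boundary requested by a shift of n, that is
 * (1 << n) bytes.  Only meaningful for shifts accepted above.
 */
static size_t
map_aligned_boundary(unsigned int n)
{
	return ((size_t)1 << n);
}
```

For example, with 4KB pages a shift of 21 passes the check and requests a 2MB boundary, while a shift of 11 would be rejected.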
#
258144 |
|
14-Nov-2013 |
jhb |
MFC 255497: Fix an off-by-one error when populating mincore(2) entries for skipped entries. lastvecindex references the last valid byte, so the new bytes should come after it.
|
#
253801 |
|
30-Jul-2013 |
jlh |
MFC r253554: Fix a panic in the racct code when munlock(2) is called with incorrect values.
The racct code in sys_munlock() assumed that the boundaries provided by the userland were correct as long as vm_map_unwire() returned successfully. However the latter contains its own logic and sometimes manages to do something out of those boundaries, even if they are buggy. This change makes the racct code use the accounting done by the vm layer, as it is done in other places such as vm_mlock().
Despite fixing the panic, Alan Cox pointed out that this code is still racy though: two simultaneous callers will produce incorrect values.
Reviewed by: alc
MFC r253556: Fix previous commit when option RACCT is not used.
Approved by: re (kib)
|
#
249079 |
|
04-Apr-2013 |
kib |
MFC r248815: Release the v_writecount reference on the vnode in case of error, before the vnode is vput() in vm_mmap_vnode().
|
#
245787 |
|
22-Jan-2013 |
zont |
MFC r240145: - Simplify VM code by using vmspace_wired_count() for counting wired memory of a process.
MFC r245255: - Reduce kernel size by removing unnecessary pointer indirections.
GENERIC kernel size reduced by 16 bytes and RACCT kernel by 336 bytes.
MFC r245296: - Improve readability of sys_obreak().
MFC r245421: - Get rid of unused function vmspace_wired_count().
|
#
245420 |
|
14-Jan-2013 |
zont |
- Fix r245416. Turn unprivileged mlock off for compatibility. Exactly this behaviour was approved by kib (mentor).
This is a direct commit.
Approved by: kib (mentor)
|
#
245416 |
|
14-Jan-2013 |
zont |
MFC r244384: - Fix locked memory accounting for maps with MAP_WIREFUTURE flag. - Add sysctl vm.old_mlock which may turn such accounting off.
MFC r244385: - Add sysctl to allow unprivileged users to call mlock(2)-family system calls and turn it off for compatibility. - Do not allow to call them inside jail.
Approved by: kib (mentor)
|
#
239577 |
|
22-Aug-2012 |
kib |
MFC r239250: For old mmap syscall, when executing on amd64 or ia64, enforce the PROT_EXEC if prot is non-zero, process is 32bit and kern.elf32.i386_read_exec sysctl is enabled.
|
#
239573 |
|
22-Aug-2012 |
kib |
MFC r239247: Adjust the r205536, by allowing a non-zero offset for anonymous mappings for a.out binaries. Apparently, a.out ld.so from FreeBSD 1.1.5.1 can issue such requests.
|
#
234767 |
|
28-Apr-2012 |
kib |
MFC r234556: When MAP_STACK mapping is created, the map entry is created only to cover the initial stack size. For MCL_WIREFUTURE maps, the subsequent call to vm_map_wire() to wire the whole stack region fails due to VM_MAP_WIRE_NOHOLES flag.
Use the VM_MAP_WIRE_HOLESOK to only wire mapped part of the stack.
|
#
234766 |
|
28-Apr-2012 |
alc |
MFC r234039 Fix mincore(2) so that it reports PG_CACHED pages as resident.
|
#
234763 |
|
28-Apr-2012 |
alc |
MFC r232166 Simplify vm_mmap()'s control flow.
Add a comment describing what vm_mmap_to_errno() does.
|
#
233728 |
|
31-Mar-2012 |
kib |
MFC r233100: In vm_object_page_clean(), do not clean OBJ_MIGHTBEDIRTY object flag if the filesystem performed short write and we are skipping the page due to this.
Propagate write error from the pager back to the callers of vm_pageout_flush(). Report the failure to write a page from the requested range as the FALSE return value from vm_object_page_clean(), and propagate it back to msync(2) to return EIO to usermode.
While there, convert the clearobjflags variable in the vm_object_page_clean() and arguments of the helper functions to boolean.
PR: kern/165927
|
#
233001 |
|
15-Mar-2012 |
kib |
MFC r232071: Account the writeable shared mappings backed by file in the vnode v_writecount.
MFC r232103: Place the if() at the right location.
MFC note: the added struct vm_object un_pager.vnp.writemappings member is located after the fields of struct vm_object that could be accessed from the modules.
|
#
231889 |
|
17-Feb-2012 |
kib |
MFC r231526: Close a race due to dropping of the map lock between creating map entry for a shared mapping and marking the entry for inheritance.
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
225617 |
|
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
#
225418 |
|
06-Sep-2011 |
kib |
Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.
Document the changes to flags field to only require the page lock.
Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
|
#
224778 |
|
11-Aug-2011 |
rwatson |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent.
Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
#
223914 |
|
10-Jul-2011 |
kib |
Extract the code to translate VM error into errno, into an exported function vm_mmap_to_errno(). It is useful for the drivers that implement mmap(2)-like functionality, to be able to return error codes consistent with mmap(2).
Sponsored by: The FreeBSD Foundation No objections from: alc MFC after: 1 week
|
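A userland sketch of the kind of translation this entry describes. The constants below are stand-ins defined only so the example builds outside the kernel, the function name sketch_vm_mmap_to_errno() is hypothetical, and the particular mapping shown is an assumption that should be checked against the kernel's vm_mmap_to_errno() before being relied upon.

```c
#include <errno.h>

/*
 * Stand-ins for the kernel's VM return codes.  The names and numeric
 * values are chosen for this sketch and are not copied from the kernel
 * headers.
 */
enum sketch_vm_rv {
	SK_KERN_SUCCESS = 0,
	SK_KERN_INVALID_ADDRESS,
	SK_KERN_PROTECTION_FAILURE,
	SK_KERN_NO_SPACE,
	SK_KERN_FAILURE
};

/*
 * Hypothetical translation from a VM return code to an errno value, so
 * that a driver offering mmap(2)-like functionality could report the
 * same error codes as mmap(2) itself.
 */
static int
sketch_vm_mmap_to_errno(enum sketch_vm_rv rv)
{
	switch (rv) {
	case SK_KERN_SUCCESS:
		return (0);
	case SK_KERN_INVALID_ADDRESS:
	case SK_KERN_NO_SPACE:
		/* Assumed: address space problems surface as ENOMEM. */
		return (ENOMEM);
	case SK_KERN_PROTECTION_FAILURE:
		/* Assumed: protection problems surface as EACCES. */
		return (EACCES);
	default:
		/* Assumed: anything else is reported as EINVAL. */
		return (EINVAL);
	}
}
```

Keeping the translation in one exported function means callers do not each carry their own copy of the switch.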
#
223913 |
|
10-Jul-2011 |
kib |
Style.
MFC after: 3 days
|
#
223825 |
|
06-Jul-2011 |
trasz |
All the racct_*() calls need to happen with the proc locked. Fixing this won't happen before 9.0. This commit adds "#ifdef RACCT" around all the "PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order to avoid useless locking/unlocking in kernels built without "options RACCT".
|
#
220373 |
|
05-Apr-2011 |
trasz |
Add accounting for most of the memory-related resources.
Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
|
#
218989 |
|
24-Feb-2011 |
pluknet |
Remove sysctl vm.max_proc_mmap used to protect from KVA space exhaustion. As Alan Cox pointed out, it no longer serves its purpose with the modern UMA allocator compared to the old one used in 4.x days.
The removal of sysctl eliminates max_proc_mmap type overflow leading to the broken mmap(2) seen with large amount of physical memory on arches with factually unbound KVA space (such as amd64). It was found that slightly less than 256GB of physmem was enough to trigger the overflow.
Reviewed by: alc, kib Approved by: avg (mentor) MFC after: 2 months
|
#
216186 |
|
04-Dec-2010 |
trasz |
Fix comment indentation.
|
#
215321 |
|
14-Nov-2010 |
kib |
Do not use __FreeBSD_version prefix for the special osrel version. The ports/Mk/bsd.port.mk uses sys/param.h to fetch osrel, and cannot grok several constants with the prefix.
Reported and tested by: swell.k gmail com MFC after: 1 week
|
#
215309 |
|
14-Nov-2010 |
kib |
Use symbolic names instead of hardcoding values for magic p_osrel constants.
MFC after: 1 week
|
#
212873 |
|
19-Sep-2010 |
alc |
Allow a POSIX shared memory object that is opened for read but not for write to nonetheless be mapped PROT_WRITE and MAP_PRIVATE, i.e., copy-on-write.
(This is a regression in the new implementation of POSIX shared memory objects that is used by HEAD and RELENG_8. This bug does not exist in RELENG_7's user-level, file-based implementation.)
PR: 150260 MFC after: 3 weeks
|
#
212282 |
|
07-Sep-2010 |
rstone |
Fix a typo in r212281. uintptr -> uintptr_t
Pointy hat to: rstone
Approved by: emaste (mentor) MFC after: 2 weeks
|
#
212281 |
|
06-Sep-2010 |
rstone |
In munmap() downgrade the vm_map_lock to a read lock before taking a read lock on the pmc-sx lock. This prevents a deadlock with pmc_log_process_mappings, which has an exclusive lock on pmc-sx and tries to get a read lock on a vm_map. Downgrading the vm_map_lock in munmap allows pmc_log_process_mappings to continue, preventing the deadlock.
Without this change I could cause a deadlock on a multicore 8.1-RELEASE system by having one thread constantly mmap'ing and then munmap'ing a PROT_EXEC mapping in a loop while I repeatedly invoked and stopped pmcstat in system-wide sampling mode.
Reviewed by: fabient Approved by: emaste (mentor) MFC after: 2 weeks
|
#
211937 |
|
28-Aug-2010 |
alc |
Add the MAP_PREFAULT_READ option to mmap(2).
Reviewed by: jhb, kib
|
#
210923 |
|
06-Aug-2010 |
kib |
Add new make_dev_p(9) flag MAKEDEV_ETERNAL to inform devfs that created cdev will never be destroyed. Propagate the flag to devfs vnodes as VV_ETERNALDEV. Use the flags to avoid acquiring devmtx and taking a thread reference on such nodes.
In collaboration with: pho MFC after: 1 month
|
#
210548 |
|
27-Jul-2010 |
trasz |
Fix commented out resource limit check in mlockall(2). It's still racy, but at least less misleading.
|
#
208574 |
|
26-May-2010 |
alc |
Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock.
Assert that the page is managed in pmap_is_referenced().
On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock.
Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment.
Correct a spelling error in vm_page_dontneed().
Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable.
Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked.
Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty().
Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions.
Reviewed by: kib
|
#
208504 |
|
24-May-2010 |
alc |
Roughly half of a typical pmap_mincore() implementation is machine-independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore().
Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page.
Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information.
Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock.
Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed().
Reviewed by: kib (an earlier version)
|
#
207410 |
|
29-Apr-2010 |
kmacy |
On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps.
Supported by: Bitgravity Inc.
Discussed with: alc, jeffr, and kib
|
#
207155 |
|
24-Apr-2010 |
alc |
Resurrect pmap_is_referenced() and use it in mincore(). Essentially, pmap_ts_referenced() is not always appropriate for checking whether or not pages have been referenced because it clears any reference bits that it encounters. For example, in mincore(), clearing the reference bits has two negative consequences. First, it throws off the activity count calculations performed by the page daemon. Specifically, a page on which mincore() has called pmap_ts_referenced() looks less active to the page daemon than it should. Consequently, the page could be deactivated prematurely by the page daemon. Arguably, this problem could be fixed by having mincore() duplicate the activity count calculation on the page. However, there is a second problem for which that is not a solution. In order to clear a reference on a 4KB page, it may be necessary to demote a 2/4MB page mapping. Thus, a mincore() by one process can have the side effect of demoting a superpage mapping within another process!
|
#
205536 |
|
23-Mar-2010 |
jhb |
Reject attempts to create a MAP_ANON mapping with a non-zero offset.
PR: kern/71258 Submitted by: Alexander Best MFC after: 2 weeks
|
#
197712 |
|
02-Oct-2009 |
bz |
Back out the functional parts from r197537. After r197711, affecting all user mappings, mmap no longer needs special treatment.
|
#
197537 |
|
27-Sep-2009 |
simon |
Do not allow mmap with the MAP_FIXED argument to map at address zero. This is done to make it harder to exploit kernel NULL pointer security vulnerabilities. While this of course does not fix vulnerabilities, it does mitigate their impact.
Note that this may break some applications, most likely emulators or similar, which for one reason or another require mapping memory at zero.
This restriction can be disabled with the security.bsd.mmap_zero sysctl variable.
Discussed with: rwatson, bz Tested by: bz (Wine), simon (VirtualBox) Submitted by: jhb
|
#
197348 |
|
20-Sep-2009 |
kib |
Old (a.out) rtld attempts to mmap zero-length region, e.g. when bss of the linked object is zero-length. More old code assumes that mmap of zero length returns success.
For a.out and pre-8 ELF binaries, allow the mmap of zero length.
Reported by: tegge Reviewed by: tegge, alc, jhb MFC after: 3 days
|
#
195693 |
|
14-Jul-2009 |
jhb |
- Change mmap() to fail requests with EINVAL that pass a length of 0. This behavior is mandated by POSIX.
- Do not fail requests that pass a length greater than SSIZE_MAX (such as > 2GB on 32-bit platforms). The 'len' parameter is actually an unsigned 'size_t' so negative values don't really make sense.
Submitted by: Alexander Best alexbestms at math.uni-muenster.de Reviewed by: alc Approved by: re (kib) MFC after: 1 week
|
#
194766 |
|
23-Jun-2009 |
kib |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation is performed by kernel, e.g. md(4).
Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced.
In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
|
#
193511 |
|
05-Jun-2009 |
rwatson |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
|
#
193275 |
|
01-Jun-2009 |
jhb |
Add an extension to the character device interface that allows character device drivers to use arbitrary VM objects to satisfy individual mmap() requests.
- A new d_mmap_single(cdev, &foff, objsize, &object, prot) callback is added to cdevsw. This function is called for each mmap() request. If it returns ENODEV, then the mmap() request will fall back to using the device's device pager object and d_mmap(). Otherwise, the method can return a VM object to satisfy this entire mmap() request via *object. It can also modify the starting offset into this object via *foff. This allows device drivers to use the file offset as a cookie to identify specific VM objects.
- vm_mmap_vnode() has been changed to call vm_mmap_cdev() directly when mapping V_CHR vnodes. This avoids duplicating all the cdev mmap handling code and simplifies some of vm_mmap_vnode().
- D_VERSION has been bumped to D_VERSION_02. Older device drivers using D_VERSION_01 are still supported.
MFC after: 1 month
|
#
190705 |
|
04-Apr-2009 |
alc |
Retire VM_PROT_READ_IS_EXEC. It was intended to be a micro-optimization, but I see no benefit from it today.
VM_PROT_READ_IS_EXEC was only intended for use on processors that do not distinguish between read and execute permission. On an mmap(2) or mprotect(2), it automatically added execute permission if the caller specified permissions included read permission. The hope was that this would reduce the number of vm map entries needed to implement an address space because there would be fewer neighboring vm map entries that differed only in the presence or absence of VM_PROT_EXECUTE. (See vm/vm_mmap.c revision 1.56.)
Today, I don't see any real applications that benefit from VM_PROT_READ_IS_EXEC. In any case, vm map entries are now organized as a self-adjusting binary search tree instead of an ordered list. So, the need for coalescing vm map entries is not as great as it once was.
|
#
189015 |
|
24-Feb-2009 |
kib |
Revert the addition of the freelist argument for the vm_map_delete() function, done in r188334. Instead, collect the entries that shall be freed, in the deferred_freelist member of the map. Automatically purge the deferred freelist when map is unlocked.
Tested by: pho Reviewed by: alc
|
#
188334 |
|
08-Feb-2009 |
kib |
Do not call vm_object_deallocate() from vm_map_delete(), because we hold the map lock there, and might need the vnode lock for OBJT_VNODE objects. Postpone object deallocation until caller of vm_map_delete() drops the map lock. Link the map entries to be freed into the freelist, that is released by the new helper function vm_map_entry_free_freelist().
Reviewed by: tegge, alc Tested by: pho
|
#
187527 |
|
21-Jan-2009 |
jhb |
Now that vfs_markatime() no longer requires an exclusive lock due to the VOP_MARKATIME() changes, use a shared vnode lock for mmap().
Submitted by: ups
|
#
184168 |
|
22-Oct-2008 |
rwatson |
Update mmap() comment: no more block devices, so no more block device cache coherency questions.
MFC after: 3 days
|
#
183216 |
|
20-Sep-2008 |
kib |
Allow the d_mmap driver methods to use cdevpriv KPI during verification phase of establishing mapping.
Discussed with: rwatson, jhb, rnoland Tested by: rnoland MFC after: 3 days
|
#
182371 |
|
28-Aug-2008 |
attilio |
Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and therefore useless.
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
181239 |
|
03-Aug-2008 |
trhodes |
Fill in a few sysctl descriptions.
Reviewed by: alc, Matt Dillon <dillon@apollo.backplane.com> Approved by: alc
|
#
179296 |
|
24-May-2008 |
alc |
To date, our implementation of munmap(2) has required that the entirety of the specified range be mapped. Specifically, it has returned EINVAL if the entire range is not mapped. There is not, however, any basis for this in either SUSv2 or our own man page. Moreover, neither Linux nor Solaris imposes this requirement. This revision removes this requirement.
Submitted by: Tijl Coosemans PR: 118510 MFC after: 6 weeks
|
#
179076 |
|
17-May-2008 |
alc |
In order to map device memory using superpages, mmap(2) must find a superpage-aligned virtual address for the mapping. Revision 1.65 implemented an overly simplistic and generally ineffectual method for finding a superpage-aligned virtual address. Specifically, it rounds the virtual address corresponding to the end of the data segment up to the next superpage-aligned virtual address. If this virtual address is unallocated, then the device will be mapped using superpages. Unfortunately, in modern times, where applications like the X server dynamically load much of their code, this virtual address is already allocated. In such cases, mmap(2) simply uses the first available virtual address, which is not necessarily superpage aligned.
This revision changes mmap(2) to use a more robust method, specifically, the VMFS_ALIGNED_SPACE option that is now implemented by vm_map_find().
|
#
178630 |
|
28-Apr-2008 |
alc |
vm_map_fixed(), unlike vm_map_find(), does not update "addr", so it can be passed by value.
|
#
177458 |
|
20-Mar-2008 |
kib |
Do not dereference cdev->si_cdevsw, use the dev_refthread() to properly obtain the reference. In particular, this fixes the panic reported in the PR. Remove the comments stating that this needs to be done.
PR: kern/119422 MFC after: 1 week
|
#
177253 |
|
16-Mar-2008 |
rwatson |
In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
MFC after: 1 month Discussed with: imp, rink
|
#
175164 |
|
08-Jan-2008 |
jhb |
Add a new file descriptor type for IPC shared memory objects and use it to implement shm_open(2) and shm_unlink(2) in the kernel:
- Each shared memory file descriptor is associated with a swap-backed vm object which provides the backing store. Each descriptor starts off with a size of zero, but the size can be altered via ftruncate(2). The shared memory file descriptors also support fstat(2). read(2), write(2), ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared memory file descriptors.
- shm_open(2) and shm_unlink(2) are now implemented as system calls that manage shared memory file descriptors. The virtual namespace that maps pathnames to shared memory file descriptors is implemented as a hash table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash of the pathname.
- As an extension, the constant 'SHM_ANON' may be specified in place of the path argument to shm_open(2). In this case, an unnamed shared memory file descriptor will be created similar to the IPC_PRIVATE key for shmget(2). Note that the shared memory object can still be shared among processes by sharing the file descriptor via fork(2) or sendmsg(2), but it is unnamed. This effectively serves to implement the getmemfd() idea bandied about the lists several times over the years.
- The backing store for shared memory file descriptors is garbage collected when it is not referenced by any open file descriptors or the shm_open(2) virtual namespace.
Submitted by: dillon, peter (previous versions) Submitted by: rwatson (I based this on his version) Reviewed by: alc (suggested converting getmemfd() to shm_open())
|
#
172930 |
|
24-Oct-2007 |
rwatson |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
mac_<object>_<method/action> mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names.
All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
|
#
172779 |
|
18-Oct-2007 |
peter |
Fix cosmetic bug in stale copy of msync_args. 'len' is size_t, not int.
|
#
171902 |
|
20-Aug-2007 |
kib |
Do not drop vm_map lock between doing vm_map_remove() and vm_map_insert(). For this, introduce vm_map_fixed() that does that for MAP_FIXED case.
Dropping the lock allowed for parallel thread to occupy the freed space.
Reported by: Tijl Coosemans <tijl ulyssis org> Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
#
171212 |
|
04-Jul-2007 |
peter |
Add freebsd6_ wrappers for mmap/lseek/pread/pwrite/truncate/ftruncate
Approved by: re (kensmith)
|
#
170864 |
|
17-Jun-2007 |
mjacob |
Make sure object is NULL- there is a possible case where you could fall through to it being used w/o being set. Put a break in the default case.
|
#
170170 |
|
31-May-2007 |
attilio |
Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
#
169667 |
|
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
164033 |
|
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
#
163606 |
|
22-Oct-2006 |
rwatson |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project Sponsored by: SPARTA
|
#
159837 |
|
21-Jun-2006 |
kib |
Make the mincore(2) return ENOMEM when requested range is not fully mapped.
Requested by: Bruno Haible <bruno at clisp org> Reviewed by: alc Approved by: pjd (mentor) MFC after: 1 month
|
#
157920 |
|
21-Apr-2006 |
trhodes |
It seems that POSIX would rather have ENODEV returned in place of EINVAL when trying to mmap() an fd that isn't a normal file.
Reference: http://www.opengroup.org/onlinepubs/009695399/functions/mmap.html Submitted by: fanf
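A small userland sketch of the error code in question. It assumes a pipe as the example of a descriptor that cannot be mapped; the commit itself does not name a specific descriptor type, and mmap_errno_on_pipe() is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to mmap() the read end of a pipe and return
 * the resulting errno (0 if the mapping unexpectedly succeeds, -1 if the
 * pipe could not be created).
 */
static int
mmap_errno_on_pipe(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	int fds[2];
	int error;
	void *p;

	assert(pagesz > 0);
	if (pipe(fds) != 0)
		return (-1);

	errno = 0;
	p = mmap(NULL, (size_t)pagesz, PROT_READ, MAP_SHARED, fds[0], 0);
	error = (p == MAP_FAILED) ? errno : 0;	/* save before cleanup */
	if (p != MAP_FAILED)
		(void)munmap(p, (size_t)pagesz);

	(void)close(fds[0]);
	(void)close(fds[1]);
	return (error);
}
```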
|
#
157144 |
|
26-Mar-2006 |
jkoshy |
MFP4: Support for profiling dynamically loaded objects.
Kernel changes:
Inform hwpmc of executable objects brought into the system by kldload() and mmap(), and of their removal by kldunload() and munmap(). A helper function linker_hwpmc_list_objects() has been added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve the list of currently loaded kernel modules.
The unused `MAPPINGCHANGE' event has been deprecated in favour of separate `MAP_IN' and `MAP_OUT' events; this change reduces space wastage in the log.
Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to handle the map change callbacks.
Change the default per-cpu sample buffer size to hold 32 samples (up from 16).
Increment __FreeBSD_version.
libpmc(3) changes:
Update libpmc(3) to deal with the new events in the log file; bring the pmclog(3) manual page in sync with the code.
pmcstat(8) changes:
Introduce new options to pmcstat(8): "-r" (root fs path), "-M" (mapfile name), "-q"/"-v" (verbosity control). Option "-k" now takes a kernel directory as its argument but will also work with the older invocation syntax.
Rework string handling in pmcstat(8) to use an opaque type for interned strings. Clean up ELF parsing code and add support for tracking dynamic object mappings reported by a v2.0.00 hwpmc(4).
Report statistics at the end of a log conversion run depending on the requested verbosity level.
Reviewed by: jhb, dds (kernel parts of an earlier patch) Tested by: gallatin (earlier patch)
|
#
151252 |
|
12-Oct-2005 |
dds |
Move execve's access time update functionality into a new vfs_mark_atime() function, and use the new function for performing efficient atime updates in mmap().
Reviewed by: bde MFC after: 2 weeks
|
#
150926 |
|
04-Oct-2005 |
dds |
Update the vnode's access time after an mmap operation on it. Before this change a copy operation with cp(1) would not update the file access times.
According to the POSIX mmap(2) documentation: the st_atime field of the mapped file may be marked for update at any time between the mmap() call and the corresponding munmap() call. The initial read or write reference to a mapped region shall cause the file's st_atime field to be marked for update if it has not already been marked for update.
|
#
150397 |
|
20-Sep-2005 |
peter |
Remove unused (but initialized) variable 'objsize' from vm_mmap()
|
#
145076 |
|
14-Apr-2005 |
csjp |
Move MAC check_vnode_mmap entry point out from being exclusive to MAP_SHARED so that the entry point gets executed un-conditionally. This may be useful for security policies which want to perform access control checks around run-time linking.
-add the mmap(2) flags argument to the check_vnode_mmap entry point so that we can make access control decisions based on the type of mapped object. -update any dependent API around this parameter addition such as function prototype modifications, entry point parameter additions and the inclusion of sys/mman.h header file. -Change the MLS, BIBA and LOMAC security policies so that subject domination routines are not executed unless the type of mapping is shared. This is done to maintain compatibility between the old vm_mmap_vnode(9) and these policies.
Reviewed by: rwatson MFC after: 1 month
|
#
144501 |
|
01-Apr-2005 |
jhb |
- Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument. - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmaping a cdev directly does not currently perform any MAC checks like mapping a vnode does. - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from.
Reviewed by: alc, arch@
|
#
140782 |
|
24-Jan-2005 |
phk |
Don't use VOP_GETVOBJECT, use vp->v_object directly.
|
#
140723 |
|
24-Jan-2005 |
jeff |
- Remove GIANT_REQUIRED where giant is no longer required. - Use VFS_LOCK_GIANT() rather than directly acquiring giant in places where giant is only held because vfs requires it.
Sponsored By: Isilon Systems, Inc.
|
#
139825 |
|
07-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
136961 |
|
26-Oct-2004 |
phk |
Don't clear flags we just checked were not set.
|
#
135727 |
|
24-Sep-2004 |
phk |
XXX mark two places where we do not hold a threadcount on the dev when frobbing the cdevsw.
In both cases we examine only the cdevsw and it is a good question if we weren't better off copying those properties into the cdev in the first place. This question will be revisited.
|
#
134615 |
|
01-Sep-2004 |
alc |
Remove dead code.
|
#
133158 |
|
05-Aug-2004 |
phk |
Remove a product specific workaround for wrong modes when mmap(2)'ing devices. They have had plenty of time to adjust now.
|
#
132999 |
|
02-Aug-2004 |
alc |
Eliminate the acquisition and release of Giant around the call to pmap_mincore() in mincore(2). Either pmap locking exists (alpha, amd64, i386, ia64) or pmap_mincore() is unimplemented (arm, powerpc, sparc64).
|
#
130344 |
|
11-Jun-2004 |
phk |
Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.
|
#
129110 |
|
11-May-2004 |
tjr |
To handle orphaned character device vnodes properly in mmap(), check that v_mount is non-null before dereferencing it. If it's null, behave as if MNT_NOEXEC was not set on the mount that originally contained it.
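The null check can be sketched as follows. This is an illustration only, not the kernel code: the sk_ structures, the SK_MNT_NOEXEC bit value, and sk_exec_allowed() are hypothetical stand-ins for struct vnode, struct mount, and MNT_NOEXEC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	SK_MNT_NOEXEC	0x0004u		/* hypothetical flag bit */

struct sk_mount {
	unsigned int mnt_flag;
};

struct sk_vnode {
	struct sk_mount *v_mount;	/* NULL once the vnode is orphaned */
};

/*
 * Decide whether an executable mapping may be granted.  An orphaned
 * vnode (v_mount == NULL) is treated as if MNT_NOEXEC were not set.
 */
static bool
sk_exec_allowed(const struct sk_vnode *vp)
{
	assert(vp != NULL);
	if (vp->v_mount == NULL)
		return (true);
	return ((vp->v_mount->mnt_flag & SK_MNT_NOEXEC) == 0);
}
```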
|
#
127961 |
|
06-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
Approved by: core
|
#
127879 |
|
05-Apr-2004 |
kan |
Delay permission checks for VCHR vnodes until after vnode is locked in vm_mmap_vnode function, where we can safely check for a special /dev/zero case. Rev. 1.180 has reordered checks and introduced a regression.
Submitted by: alc Was broken by: kan
|
#
127187 |
|
18-Mar-2004 |
guido |
When mmap-ing a file from a noexec mount, be sure not to grant the right to mmap it PROT_EXEC. This also depends on the architecture, as some architectures (e.g. i386) do not distinguish between read and exec pages.
Inspired by: http://linux.bkbits.net:8080/linux-2.4/cset@1.1267.1.85 Reviewed by: alc
|
#
127013 |
|
15-Mar-2004 |
truckman |
Make overflow/wraparound checking more robust and unbreak len=0 in vslock(), mlock(), and munlock().
Reviewed by: bde
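One way to make such a range check robust against wraparound while still accepting len=0 is sketched below. This is an assumption about the general technique, not the code from this commit; sk_wire_range() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the page-aligned range [*startp, *endp) covering addr..addr+len.
 * Returns false if either the addition or the rounding wraps around the
 * address space.  A zero length is accepted.
 */
static bool
sk_wire_range(uintptr_t addr, size_t len, uintptr_t pagesz,
    uintptr_t *startp, uintptr_t *endp)
{
	uintptr_t last, start, end;

	/* The masking below requires a power-of-two page size. */
	assert(pagesz != 0 && (pagesz & (pagesz - 1)) == 0);

	last = addr + len;		/* unsigned, so wrap is well defined */
	if (last < addr)
		return (false);

	start = addr & ~(pagesz - 1);			/* round down */
	end = (last + pagesz - 1) & ~(pagesz - 1);	/* round up */
	if (end < start)
		return (false);		/* rounding up wrapped */

	*startp = start;
	*endp = end;
	return (true);
}
```

Checking both the sum and the rounded end matters: a range that ends within the last page of the address space passes the first test but wraps when rounded up.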
|
#
127008 |
|
15-Mar-2004 |
truckman |
Style(9) changes.
Pointed out by: bde
|
#
127006 |
|
15-Mar-2004 |
truckman |
Remove redundant suser() check.
|
#
126668 |
|
05-Mar-2004 |
truckman |
Undo the merger of mlock()/vslock and munlock()/vsunlock() and the introduction of kern_mlock() and kern_munlock() in src/sys/kern/kern_sysctl.c 1.150 src/sys/vm/vm_extern.h 1.69 src/sys/vm/vm_glue.c 1.190 src/sys/vm/vm_mmap.c 1.179 because different resource limits are appropriate for transient and "permanent" page wiring requests.
Retain the kern_mlock() and kern_munlock() API in the revived vslock() and vsunlock() functions.
Combine the best parts of each of the original sets of implementations with further code cleanup. Make the mlock() and vslock() implementations as similar as possible.
Retain the RLIMIT_MEMLOCK check in mlock(). Move the most stringent test, which can return EAGAIN, last so that requests that have no hope of ever being satisfied will not be retried unnecessarily.
Disable the test that can return EAGAIN in the vslock() implementation because it will cause the sysctl code to wedge.
Tested by: Cy Schubert <Cy.Schubert AT komquats.com>
|
#
126424 |
|
01-Mar-2004 |
kan |
Pick up a do {} while(0) cleanup by phk that was accidentally discarded in the previous revision.
Submitted by: alc
|
#
126332 |
|
27-Feb-2004 |
kan |
Move the code dealing with vnode out of several functions into a single helper function vm_mmap_vnode.
Discussed with: jeffr,alc (a while ago)
|
#
126253 |
|
25-Feb-2004 |
truckman |
Split the mlock() kernel code into two parts, mlock(), which unpacks the syscall arguments and does the suser() permission check, and kern_mlock(), which does the resource limit checking and calls vm_map_wire(). Split munlock() in a similar way.
Enable the RLIMIT_MEMLOCK checking code in kern_mlock().
Replace calls to vslock() and vsunlock() in the sysctl code with calls to kern_mlock() and kern_munlock() so that the sysctl code will obey the wired memory limits.
Nuke the vslock() and vsunlock() implementations, which are no longer used.
Add a member to struct sysctl_req to track the amount of memory that is wired to handle the request.
Modify sysctl_wire_old_buffer() to return an error if its call to kern_mlock() fails. Only wire the minimum of the length specified in the sysctl request and the length specified in its argument list. It is recommended that sysctl handlers that use sysctl_wire_old_buffer() should specify reasonable estimates for the amount of data they want to return so that only the minimum amount of memory is wired no matter what length has been specified by the request.
Modify the callers of sysctl_wire_old_buffer() to look for the error return.
Modify sysctl_old_user to obey the wired buffer length and clean up its implementation.
Reviewed by: bms
|
#
125454 |
|
04-Feb-2004 |
jhb |
Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists.
Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
|
#
123697 |
|
21-Dec-2003 |
alc |
- Correct an error in mincore(2) that has existed since its introduction: mincore(2) should check that the page is valid, not just allocated. Otherwise, it can return a false positive for a page that is not yet resident because it is being read from disk.
|
#
123280 |
|
08-Dec-2003 |
kan |
Remove trailing whitespace.
|
#
123276 |
|
07-Dec-2003 |
alc |
Addendum to revision 1.174: In the case where vm_pager_allocate() is called to create a vnode-backed object, the vnode lock must be held by the caller.
Reported by: truckman Discussed with: kan
|
#
123168 |
|
06-Dec-2003 |
alc |
Fix a deadlock between vm_fault() and vm_mmap(): The expected lock ordering between vm_map and vnode locks is that vm_map locks are acquired first. In revision 1.150 mmap(2) was changed to pass a locked vnode into vm_mmap(). This creates a lock-order reversal when vm_mmap() calls one of the vm_map routines that acquires a vm_map lock. The solution implemented herein is to release the vnode lock in mmap() before calling vm_mmap() and reacquire this lock if necessary in vm_mmap().
Approved by: re (scottl) Reviewed by: jeff, kan, rwatson
|
#
122651 |
|
14-Nov-2003 |
alc |
- Remove long dead code.
|
#
122646 |
|
14-Nov-2003 |
alc |
Changes to msync(2) - Return EBUSY if the region was wired by mlock(2) and MS_INVALIDATE is specified to msync(2). This is required by the Open Group Base Specifications Issue 6. - vm_map_sync() doesn't return KERN_FAILURE. Thus, msync(2) can't possibly return EIO. - The second major loop in vm_map_sync() handles sub maps. Thus, failing on sub maps in the first major loop isn't necessary.
|
#
122384 |
|
09-Nov-2003 |
alc |
- The Open Group Base Specifications Issue 6 specifies that an munmap(2) must return EINVAL if size is zero. Submitted by: tegge - In order to avoid a race condition in multithreaded applications, the check and removal operations by munmap(2) must be in the same critical section. To accommodate this, vm_map_check_protection() is modified to require its caller to obtain at least a read lock on the map.
|
#
122367 |
|
09-Nov-2003 |
alc |
- Remove Giant from msync(2). Giant is still acquired by the lower layers if we drop into the pmap or vnode layers. - Migrate the handling of zero-length msync(2)s into vm_map_sync() so that multithread applications can't change the map between implementing the zero-length hack in msync(2) and reacquiring the map lock in vm_map_sync().
Reviewed by: tegge
|
#
122349 |
|
09-Nov-2003 |
alc |
- Rename vm_map_clean() to vm_map_sync(). This better reflects the fact that msync(2) is its only caller. - Migrate the parts of the old vm_map_clean() that examined the internals of a vm object to a new function vm_object_sync() that is implemented in vm_object.c. At the same time, introduce the necessary vm object locking so that vm_map_sync() and vm_object_sync() can be called without Giant.
Reviewed by: tegge
|
#
120837 |
|
05-Oct-2003 |
bms |
Only the super-user should be able to wire pages via the mlock() family of system calls at this time. Remove various #ifdef's to enforce this.
|
#
120531 |
|
27-Sep-2003 |
marcel |
Part 2 of implementing rstacks: add the ability to create rstacks and use the ability on ia64 to map the register stack. The orientation of the stack (i.e. its grow direction) is passed to vm_map_stack() in the overloaded cow argument. Since the grow direction is represented by bits, it is possible and allowed to create bi-directional stacks. This is not an advertised feature, more of a side-effect.
Fix a bug in vm_map_growstack() that's specific to rstacks and which we could only find by having the ability to create rstacks: when the mapped stack ends at the faulting address, we have not actually mapped the faulting address. We need to include or cover the faulting address.
Note that at this time mmap(2) has not been extended to allow the creation of rstacks by processes. If such a need arises, this can be done.
Tested on: alpha, i386, ia64, sparc64
|
#
120422 |
|
24-Sep-2003 |
peter |
Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit systems where the data/stack/etc limits are too big for a 32 bit process.
Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.
Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 hierarchy.
Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced.
Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does.
Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.
|
#
119858 |
|
07-Sep-2003 |
alc |
Revise the locking in mincore(2).
|
#
118771 |
|
11-Aug-2003 |
bms |
Add the mlockall() and munlockall() system calls. - All those diffs to syscalls.master for each architecture *are* necessary. This needed clarification; the stub code generation for mlockall() was disabled, which would prevent applications from linking to this API (suggested by mux) - Giant has been quashed. It is no longer held by the code, as the required locking has been pushed down within vm_map.c. - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES to express their intention explicitly. - Inspected at the vmstat, top and vm pager sysctl stats level. Paging-in activity is occurring correctly, using a test harness. - The RES size for a process may appear to be greater than its SIZE. This is believed to be due to mappings of the same shared library page being wired twice. Further exploration is needed. - Believed to back out of allocations and locks correctly (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).
PR: kern/43426, standards/54223 Reviewed by: jake, alc Approved by: jake (mentor) MFC after: 2 weeks
|
#
117224 |
|
04-Jul-2003 |
phk |
Remove unnecessary cast.
|
#
116678 |
|
22-Jun-2003 |
phk |
Add a f_vnode field to struct file.
Several of the subtypes have an associated vnode which is used for stuff like the f*() functions.
By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use.
At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.
|
#
116653 |
|
21-Jun-2003 |
phk |
Use a do {...} while (0); and a couple of breaks to reduce the level of indentation a bit.
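For readers unfamiliar with the idiom, a generic sketch follows. It is not the code touched by this commit; struct sk_req and sk_check() are hypothetical, and the particular checks and error codes are chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical request state used only for this illustration. */
struct sk_req {
	int have_file;
	int is_regular;
	int readable;
};

/*
 * Each failed check sets an error and leaves the block with break, so
 * the checks stay at one indentation level instead of nesting inside
 * each other, and the code after the block runs on every path.
 */
static int
sk_check(const struct sk_req *r)
{
	int error = 0;

	assert(r != NULL);
	do {
		if (!r->have_file) {
			error = EBADF;
			break;
		}
		if (!r->is_regular) {
			error = ENODEV;
			break;
		}
		if (!r->readable) {
			error = EACCES;
			break;
		}
	} while (0);

	/* Common cleanup shared by all paths would go here. */
	return (error);
}
```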
|
#
116226 |
|
11-Jun-2003 |
obrien |
Use __FBSDID().
|
#
116080 |
|
09-Jun-2003 |
alc |
Hold the vm object's lock when performing vm_page_lookup().
|
#
113639 |
|
17-Apr-2003 |
jhb |
suser() does not need the proc lock, just the setting of P_PROTECTED in p_flag needs the lock.
|
#
112881 |
|
31-Mar-2003 |
wes |
Add a facility allowing processes to inform the VM subsystem they are critical and should not be killed when pageout is looking for more memory pages in all the wrong places.
Reviewed by: arch@ Sponsored by: St. Bernard Software
|
#
112835 |
|
29-Mar-2003 |
mux |
The object type can't be OBJT_PHYS in vm_mmap().
Reviewed by: peter
|
#
109153 |
|
12-Jan-2003 |
dillon |
Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.
|
#
109123 |
|
11-Jan-2003 |
dillon |
Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it.
Change struct xfile xf_data to xun_data (ABI is still compatible).
If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.
|
#
107370 |
|
28-Nov-2002 |
alc |
Lock page field accesses in mincore().
Approved by: re (blanket)
|
#
105718 |
|
22-Oct-2002 |
rwatson |
Invoke mac_check_vnode_mmap() during mmap operations on vnodes, permitting policies to restrict access to memory mapping based on the credential requesting the mapping, the target vnode, the requested rights, or other policy considerations.
Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
103767 |
|
21-Sep-2002 |
jake |
Use the fields in the sysentvec and in the vm map header in place of the constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
|
#
99509 |
|
06-Jul-2002 |
jeff |
- Hold a lock on the vnode acquired from the file table across the call to vm_mmap() as well as the GETATTR etc. - If the handle is a vnode in vm_mmap() assert that it is locked. - Wiggle Giant around a little to account for the extra vnode operation.
|
#
98833 |
|
25-Jun-2002 |
dillon |
Part I of RLIMIT_VMEM implementation. Implement core functionality for a new resource limit that covers a process's entire VM space, including mmap()'d space.
(Part II will be additional code to check RLIMIT_VMEM during exec() but it needs more fleshing out).
PR: kern/18209 Submitted by: Andrey Alekseyev <uitm@zenon.net>, Dmitry Kim <jason@nichego.net> MFC after: 7 days
|
#
98656 |
|
22-Jun-2002 |
alc |
o Remove the unnecessary acquisition and release of Giant around fdrop() in mmap(2).
|
#
98632 |
|
22-Jun-2002 |
alc |
o Reduce the scope of Giant in vm_mmap() to just the code that manipulates a vnode. (Thus, MAP_ANON and MAP_STACK never acquire Giant.)
|
#
98304 |
|
16-Jun-2002 |
alc |
o Remove GIANT_REQUIRED from vm_fault_user_wire(). o Move pmap_pageable() outside of Giant in vm_fault_unwire(). (pmap_pageable() is a no-op on all supported architectures.) o Remove the acquisition and release of Giant from mlock().
|
#
98240 |
|
15-Jun-2002 |
alc |
o Remove the acquisition and release of Giant from munlock().
Reviewed by: tegge
|
#
98226 |
|
14-Jun-2002 |
alc |
o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_clear_recursive() and vm_map_set_recursive(). (They were only used by vm_map_pageable() and vm_map_user_pageable().)
Reviewed by: tegge
|
#
97947 |
|
06-Jun-2002 |
alfred |
fix typo in _SYS_SYSPROTO_H_ case: s/mlockall_args/munlockall_args
Submitted by: Mark Santcroos <marks@ripe.net>
|
#
97556 |
|
30-May-2002 |
alfred |
Check for defined(__i386__) instead of just defined(i386) since the compiler will be updated to only define(__i386__) for ANSI cleanliness.
|
#
97294 |
|
26-May-2002 |
alc |
o Acquire and release Giant around pmap operations in vm_fault_unwire() and vm_map_delete(). Assert GIANT_REQUIRED in vm_map_delete() only if operating on the kernel_object or the kmem_object. o Remove GIANT_REQUIRED from vm_map_remove(). o Remove the acquisition and release of Giant from munmap().
|
#
96875 |
|
18-May-2002 |
alc |
o Eliminate the acquisition and release of Giant from minherit(2). (vm_map_inherit() no longer requires Giant to be held.)
|
#
96839 |
|
18-May-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_madvise(). Instead, acquire and release Giant around vm_map_madvise()'s call to pmap_object_init_pt(). o Replace GIANT_REQUIRED in vm_object_madvise() with the acquisition and release of Giant. o Remove the acquisition and release of Giant from madvise().
|
#
96832 |
|
18-May-2002 |
alc |
o Remove the acquisition and release of Giant from mprotect().
|
#
96007 |
|
04-May-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_lookup_entry() and vm_map_check_protection(). o Call vm_map_check_protection() without Giant held in munmap().
|
#
93593 |
|
01-Apr-2002 |
jhb |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
|
#
92727 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
92029 |
|
10-Mar-2002 |
eivind |
- Remove a number of extra newlines that do not belong here according to style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs.
Reviewed by: alc
|
#
91406 |
|
27-Feb-2002 |
jhb |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
#
90702 |
|
15-Feb-2002 |
bde |
Garbage-collect options ACPI_NO_ENABLE_ON_BOOT, AML_DEBUG, BLEED, DEVICE_SYSCTLS, KEY, LOUTB, NFS_MUIDHASHSIZ, NFS_UIDHASHSIZ, PCI_QUIET and SIMPLELOCK_DEBUG.
|
#
89319 |
|
13-Jan-2002 |
alfred |
Replace ffind_* with fget calls.
Make fget MPsafe.
Make fgetvp and fgetsock use the fget subsystem to reduce code bloat.
Push giant down in fpathconf().
|
#
89306 |
|
13-Jan-2002 |
alfred |
SMP Lock struct file, filedesc and the global file list.
Seigo Tanimura (tanimura) posted the initial delta.
I've polished it quite a bit reducing the need for locking and adapting it for KSE.
Locks:
1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked.
1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex.
1 sx lock for the global filelist.
struct file * fhold(struct file *fp); /* increments reference count on a file */
struct file * fhold_locked(struct file *fp); /* like fhold but expects file to be locked */
struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */
struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */
I still have to smp-safe the fget cruft, I'll get to that asap.
|
#
84783 |
|
10-Oct-2001 |
ps |
Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable.
Reviewed by: peter MFC after: 2 weeks
|
#
83986 |
|
26-Sep-2001 |
rwatson |
o Modify access control checks in mmap() to use securelevel_gt() instead of direct variable access.
Obtained from: TrustedBSD Project
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
82612 |
|
30-Aug-2001 |
dillon |
Cleanup
|
#
82290 |
|
24-Aug-2001 |
dillon |
Remove support for the badly broken MAP_INHERIT (from -current only).
|
#
79242 |
|
04-Jul-2001 |
dillon |
whitespace / register cleanup
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
77139 |
|
24-May-2001 |
jhb |
Stick VM syscalls back under Giant if the BLEED option is not defined.
|
#
77083 |
|
23-May-2001 |
jhb |
- Obtain Giant in mmap() syscall while messing with file descriptors and vnodes. - Fix an old bug that would leak a reference to a fd if the vnode being mmap'd wasn't of type VREG or VCHR. - Lock Giant in vm_mmap() around calls into the VM that can call into pager routines that need Giant or into other VM routines that need Giant. - Replace code that used a goto to jump around the else branch of a test to use an else branch instead.
|
#
76974 |
|
22-May-2001 |
jhb |
Unlock the VM lock at the end of munlock() instead of locking it again.
|
#
76827 |
|
18-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
#
76166 |
|
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
#
68883 |
|
18-Nov-2000 |
dillon |
This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc...
PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>
|
#
65770 |
|
12-Sep-2000 |
bp |
Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT. They will be used by nullfs and other stacked filesystems to support full cache coherency.
Reviewed in general by: mckusick, dillon
|
#
63897 |
|
26-Jul-2000 |
mckusick |
Clean up the snapshot code so that it no longer depends on the use of the SF_IMMUTABLE flag to prevent writing. Instead put in explicit checking for the SF_SNAPSHOT flag in the appropriate places. With this change, it is now possible to rename and link to snapshot files. It is also possible to set or clear any of the owner, group, or other read bits on the file, though none of the write or execute bits can be set. There is also an explicit test to prevent the setting or clearing of the SF_SNAPSHOT flag via chflags() or fchflags(). Note also that the modify time cannot be changed as it needs to accurately reflect the time that the snapshot was taken.
Submitted by: Robert Watson <rwatson@FreeBSD.org>
|
#
62067 |
|
25-Jun-2000 |
markm |
Nifty idea from Jeroen van Gelderen; don't call a routine to check if we are using the /dev/zero device, just check a flag (supplied by /dev/zero). Reviewed by: dfr
|
#
60757 |
|
21-May-2000 |
peter |
Checkpoint of a new physical memory backed object type, that does not have pv_entries. This is intended for very special circumstances, eg: a certain database that has a 1GB shm segment mapped into 300 processes. That would consume 2GB of kvm just to hold the pv_entries alone. This would not be used on systems unless the physical ram was available, as it's not pageable.
This is a work-in-progress, but is a useful and functional checkpoint. Matt has got some more fixes for it that will be committed soon.
Reviewed by: dillon
|
#
60755 |
|
21-May-2000 |
peter |
Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or its shadow pv_table entry.
Inspired by: John Dyson's 1998 patches.
Also: Eliminate pv_table as a separate thing and build it into a machine dependent part of vm_page_t. This eliminates having a separate set of structures that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha).
Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits).
Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files.
Reviewed by: dillon
|
#
59496 |
|
22-Apr-2000 |
wollman |
Implement POSIX.1b shared memory objects. In this implementation, shared memory objects are regular files; the shm_open(3) routine uses fcntl(2) to set a flag on the descriptor which tells mmap(2) to automatically apply MAP_NOSYNC.
Not objected to by: bde, dillon, dufault, jasone
|
#
58705 |
|
27-Mar-2000 |
charnier |
Revert spelling mistake I made in the previous commit.
Requested by: Alan and Bruce
|
#
58634 |
|
26-Mar-2000 |
charnier |
Spelling
|
#
57550 |
|
28-Feb-2000 |
ps |
Add MAP_NOCORE to mmap(2), and MADV_NOCORE and MADV_CORE to madvise(2). This feature allows you to specify if mmap'd data is included in an application's corefile.
Change the type of eflags in struct vm_map_entry from u_char to vm_eflags_t (an unsigned int).
Reviewed by: dillon,jdp,alfred Approved by: jkh
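A minimal userland sketch of how the new flag might be used, assuming a FreeBSD-style <sys/mman.h>. The helper name alloc_nocore() is hypothetical, and on systems that do not define MAP_NOCORE the flag is simply dropped, so the mapping then behaves like an ordinary anonymous one.

```c
#define _DEFAULT_SOURCE		/* expose MAP_ANON on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/* MAP_NOCORE is FreeBSD-specific; fall back to no extra flag elsewhere. */
#ifdef MAP_NOCORE
#define NOCORE_FLAG MAP_NOCORE
#else
#define NOCORE_FLAG 0
#endif

/*
 * Hypothetical helper: allocate len bytes of anonymous memory that
 * should be left out of the process's core file.
 * Returns NULL on failure.
 */
static void *
alloc_nocore(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANON | NOCORE_FLAG, -1, 0);
	return (p == MAP_FAILED ? NULL : p);
}
```

Per the commit message, an existing mapping can be switched with madvise(2) using MADV_NOCORE, and switched back with MADV_CORE.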
|
#
57263 |
|
16-Feb-2000 |
dillon |
Fix null-pointer dereference crash when the system is intentionally run out of KVM through a mmap()/fork() bomb that allocates hundreds of thousands of vm_map_entry structures.
Add panic to make null-pointer dereference crash a little more verbose.
Add a new sysctl, vm.max_proc_mmap, which specifies the maximum number of mmap()'d spaces (discrete vm_map_entry's in the process). The value defaults to around 9000 for a 128MB machine. The test is scaled for the number of processes sharing a vmspace (aka linux threads). Setting the value to 0 disables the feature.
PR: kern/16573 Approved by: jkh
|
#
55351 |
|
03-Jan-2000 |
guido |
Use MAP_NOSYNC for vnodes without any links in their filesystem.
This is necessary for vmware: it does not use an anonymous mmap for the memory of the virtual system. Instead it creates a temp file and unlinks it. For a 50 MB file, this results in a lot of syncing every 30 seconds.
Reviewed by: Matthew Dillon <dillon@backplane.com>
|
#
54467 |
|
12-Dec-1999 |
dillon |
Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to madvise().
This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis.
MAP_NOSYNC allows one to use mmap() to share memory between processes without incurring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory.
Reviewed by: julian, alc, dg
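The following is a sketch, not code from the tree, of a file-backed shared mapping that requests MAP_NOSYNC. The helper names open_scratch_file() and map_shared_nosync() and the /tmp path are assumptions made for illustration; where MAP_NOSYNC is not defined the flag is omitted and the mapping is a plain MAP_SHARED one.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* MAP_NOSYNC is FreeBSD-specific; elsewhere the flag is left out. */
#ifdef MAP_NOSYNC
#define NOSYNC_FLAG MAP_NOSYNC
#else
#define NOSYNC_FLAG 0
#endif

/* Hypothetical helper: create an unlinked scratch file of len bytes. */
static int
open_scratch_file(size_t len)
{
	char path[] = "/tmp/nosync-XXXXXX";
	int fd = mkstemp(path);

	if (fd < 0)
		return (-1);
	unlink(path);		/* the file lives on until fd is closed */
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/*
 * Hypothetical helper: map len bytes of fd shared and writable, asking
 * that dirty pages not be flushed periodically. Returns NULL on failure.
 */
static void *
map_shared_nosync(int fd, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_SHARED | NOSYNC_FLAG, fd, 0);

	return (p == MAP_FAILED ? NULL : p);
}
```

As the message notes, the mapping stays coherent with the filesystem, so data stored through the mapping should still be visible to read() or pread() on the same descriptor.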
|
#
52635 |
|
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
51493 |
|
21-Sep-1999 |
dillon |
cleanup madvise code, add a few more sanity checks.
Reviewed by: Alan Cox <alc@cs.rice.edu>, dg@root.com
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
49535 |
|
08-Aug-1999 |
phk |
Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>.
Add a few fields to struct specinfo, paving the way for the fun part.
|
#
47765 |
|
05-Jun-1999 |
alc |
vm_mmap: Ensure that device mappings get MAP_PREFAULT(_PARTIAL) set, so that 4M page mappings are used when possible.
Reviewed by: Luoqi Chen <luoqi@watermarkgroup.com>
|
#
47258 |
|
16-May-1999 |
alc |
Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert, eliminating the need for the pmap_object_init_pt calls in imgact_* and mmap.
Reviewed by: David Greenman <dg@root.com>
|
#
47243 |
|
16-May-1999 |
alc |
Remove prototypes for functions that don't exist anymore (vm_map.h).
Remove a useless argument from vm_map_madvise's interface (vm_map.c, vm_map.h, and vm_mmap.c).
Remove a redundant test in vm_uiomove (vm_map.c).
Make two changes to vm_object_coalesce:
1. Determine whether the new range of pages actually overlaps the existing object's range of pages before calling vm_object_page_remove. (Prior to this change almost 90% of the calls to vm_object_page_remove were to remove pages that were beyond the end of the object.)
2. Free any swap space allocated to removed pages.
|
#
47207 |
|
14-May-1999 |
alc |
Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option.
It never makes sense to specify MAP_COPY_NEEDED without also specifying MAP_COPY_ON_WRITE, and vice versa. Thus, MAP_COPY_ON_WRITE suffices.
Reviewed by: David Greenman <dg@root.com>
|
#
46592 |
|
06-May-1999 |
peter |
Add brackets to silence egcs and help clarity.
|
#
46538 |
|
05-May-1999 |
luoqi |
Don't ignore mmap() address hint below the text section.
|
#
46112 |
|
27-Apr-1999 |
phk |
Suser() simplification:
1: s/suser/suser_xxx/
2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with later.
There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
|
#
45821 |
|
19-Apr-1999 |
peter |
unifdef -DVM_STACK - it's been on for a while for x86 and was checked and appeared to be working for the Alpha some time ago.
|
#
44438 |
|
02-Mar-1999 |
alc |
To avoid a conflict for the vm_map's lock with vm_fault, release the read lock around the subyte operations in mincore. After the lock is reacquired, use the map's timestamp to determine if we need to restart the scan.
|
#
44379 |
|
01-Mar-1999 |
alc |
mincore doesn't modify the vm_map. Therefore, it doesn't require an exclusive lock. A read lock will suffice.
|
#
44146 |
|
19-Feb-1999 |
luoqi |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper.
Reviewed by: Alan Cox <alc@cs.rice.edu>, Matthew Dillon <dillon@apollo.backplane.com>
|
#
43748 |
|
07-Feb-1999 |
dillon |
Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.
|
#
43209 |
|
26-Jan-1999 |
julian |
Mostly remove the VM_STACK OPTION. This changes the definitions of a few items so that structures are the same whether or not the option itself is enabled. This allows people to enable and disable the option without recompiling the world.
As the author says:
|I ran into a problem pulling out the VM_STACK option. I was aware of this |when I first did the work, but then forgot about it. The VM_STACK stuff |has some code changes in the i386 branch. There need to be corresponding |changes in the alpha branch before it can come out completely.
what is done: | |1) Pull the VM_STACK option out of the header files it appears in. This |really shouldn't affect anything that executes with or without the rest |of the VM_STACK patches. The vm_map_entry will then always have one |extra element (avail_ssize). It just won't be used if the VM_STACK |option is not turned on. | |I've also pulled the option out of vm_map.c. This shouldn't harm anything, |since the routines that are enabled as a result are not called unless |the VM_STACK option is enabled elsewhere. | |2) Add what appears to be appropriate code to the alpha branch, still |protected behind the VM_STACK switch. I don't have an alpha machine, |so we would need to get some testers with alpha machines to try it out. | |Once there is some testing, we can consider making the change permanent |for both i386 and alpha. | [..] | |Once the alpha code is adequately tested, we can pull VM_STACK out |everywhere. |
Submitted by: "Richard Seaman, Jr." <dick@tar.com>
|
#
42957 |
|
21-Jan-1999 |
dillon |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
42360 |
|
06-Jan-1999 |
julian |
Add (but don't activate) code for a special VM option to make downward growing stacks more general. Add (but don't activate) code to use the new stack facility when running threads, (specifically the linux threads support). This allows people to use both linux compiled linuxthreads, and also the native FreeBSD linux-threads port.
The code is conditional on VM_STACK. Not using this will produce the old heavily tested system.
Submitted by: Richard Seaman <dick@tar.com>
|
#
41620 |
|
09-Dec-1998 |
dt |
Don't disable mmap with large file offset.
|
#
40286 |
|
13-Oct-1998 |
dg |
Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.
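To illustrate the second class of bug, here is a small sketch of how a narrowing cast inside a page-rounding macro can scramble a 64-bit offset. The log does not show the cast that was removed, so a 32-bit cast and a 4 KB page size are assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u			/* assumed page size */
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)

/* Rounding through a 32-bit cast: the upper 32 bits are discarded. */
static uint64_t
round_page_narrow(uint64_t x)
{
	return (((uint32_t)x + SKETCH_PAGE_MASK) & ~SKETCH_PAGE_MASK);
}

/* Rounding at full width keeps 64-bit offsets intact. */
static uint64_t
round_page_wide(uint64_t x)
{
	return ((x + SKETCH_PAGE_MASK) & ~(uint64_t)SKETCH_PAGE_MASK);
}
```

For an offset just past 4 GB, such as 0x100000001, the narrow version yields 0x1000 while the wide version yields 0x100001000.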
|
#
38799 |
|
04-Sep-1998 |
dfr |
Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.
|
#
38517 |
|
24-Aug-1998 |
dfr |
Change various syscalls to use size_t arguments instead of u_int.
Add some overflow checks to read/write (from bde).
Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable.
Reviewed by: Bruce Evans <bde@zeta.org.au>
|
#
37649 |
|
15-Jul-1998 |
bde |
Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this change is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.
|
#
37395 |
|
05-Jul-1998 |
dfr |
Don't truncate the return value of mmap to sizeof(int).
|
#
37101 |
|
21-Jun-1998 |
bde |
Removed unused includes.
|
#
36735 |
|
07-Jun-1998 |
dfr |
This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change.
The prototype FreeBSD/alpha machdep will follow in a couple of days time.
|
#
36177 |
|
19-May-1998 |
peter |
Make the previous commit compile..
|
#
36164 |
|
18-May-1998 |
guido |
Plug hole reported on Bugtraq: do not allow mmap with WRITE privs for append-only and immutable files.
Obtained from: OpenBSD (partly)
|
#
34525 |
|
12-Mar-1998 |
guido |
Fix for mmap of char devices bug as described in OpenBSD advisory of 1998/02/20
Reviewed by: John Dyson
Submitted by: "Cy Schubert" <cschuber@uumail.gov.bc.ca>
|
#
34206 |
|
07-Mar-1998 |
dyson |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated.
1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid.
2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them.
3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state.
4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances.
5) Remove a gratuitous buffer wakeup in vfs_vmio_release.
6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version.
7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation.
8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed.
9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals.
10) Correct and clean-up spec_getpages.
11) Implement a fully functional nfs_getpages, nfs_putpages.
12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.)
13) Properly support MS_INVALIDATE on NFS.
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean.
15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads.
16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.)
17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.)
18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
32132 |
|
31-Dec-1997 |
alex |
caddr_t --> void *
|
#
31778 |
|
16-Dec-1997 |
eivind |
Make COMPAT_43 and COMPAT_SUNOS new-style options.
|
#
30994 |
|
06-Nov-1997 |
phk |
Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead.
This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources.
I have not removed the /*ARGSUSED*/, they will require some looking at.
libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.
|
#
28992 |
|
01-Sep-1997 |
bde |
Removed unused #includes.
|
#
28940 |
|
30-Aug-1997 |
peter |
Allow non-page aligned file offset mmap's, providing that the system is allowed to choose the address, or that the MAP_FIXED address has the same remainder when modulo PAGE_SIZE as the file offset. Apparently this is posix1003.1b specified behavior. SVR4 and the other *BSD's allow it too. It costs us nothing to support and means we don't get EINVAL on some mmap code that works perfectly elsewhere.
Obtained from: NetBSD
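The "same remainder modulo PAGE_SIZE" rule can be sketched from the caller's side. The helper below is a hypothetical, portable illustration that does the adjustment by hand (map from the enclosing page boundary, then step forward by the remainder), which is roughly the arithmetic a kernel accepting unaligned offsets would perform internally. The names off_mapping and map_at_offset() are assumptions, and off is assumed to be non-negative.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct off_mapping {
	void	*base;		/* page-aligned address returned by mmap() */
	size_t	 maplen;	/* length actually mapped */
	char	*data;		/* address corresponding to the requested offset */
};

/*
 * Map len bytes of fd starting at an arbitrary (non-negative) file
 * offset. Returns 0 on success and -1 on failure.
 */
static int
map_at_offset(int fd, off_t off, size_t len, struct off_mapping *m)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	off_t pageoff = off % pagesize;	/* remainder within the page */

	m->maplen = len + (size_t)pageoff;
	m->base = mmap(NULL, m->maplen, PROT_READ, MAP_PRIVATE, fd,
	    off - pageoff);
	if (m->base == MAP_FAILED)
		return (-1);
	/* The returned data pointer has the same remainder as the offset. */
	m->data = (char *)m->base + pageoff;
	return (0);
}
```

The caller reads through m->data and later unmaps with munmap(m->base, m->maplen).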
|
#
28751 |
|
25-Aug-1997 |
bde |
Fixed type mismatches for functions with args of type vm_prot_t and/or vm_inherit_t. These types are smaller than ints, so the prototypes should have used the promoted type (int) to match the old-style function definitions. They use just vm_prot_t and/or vm_inherit_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change the small types to u_int, to optimize for time instead of space.
|
#
27464 |
|
17-Jul-1997 |
dyson |
Add support for 4MB pages. This includes the .text, .data, .data parts of the kernel, and also most of the dynamic parts of the kernel. Additionally, 4MB pages will be allocated for display buffers as appropriate (only.)
The 4MB support for SMP isn't complete, but doesn't interfere with operation either.
|
#
26668 |
|
15-Jun-1997 |
dyson |
Correct the return code for the mlock system call. Also add the stubs for mlockall and munlockall.
|
#
24131 |
|
23-Mar-1997 |
bde |
Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined. Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
21754 |
|
16-Jan-1997 |
dyson |
Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
21529 |
|
11-Jan-1997 |
dyson |
Prepare better for multi-platform by eliminating another required pmap routine (pmap_is_referenced.) Upper level recoded to use pmap_ts_referenced.
|
#
21039 |
|
30-Dec-1996 |
dyson |
Let the VM system know that on certain arch's that VM_PROT_READ also implies VM_PROT_EXEC. We support it that way for now, since the break system call by default gives VM_PROT_ALL. Now we have a better chance of coalescing map entries when mixing mmap/break type operations. This was contributing to excessive numbers of map entries on the modula-3 runtime system. The problem is still not "solved", but the situation makes more sense.
Eventually, when we work on architectures where VM_PROT_READ is orthogonal to VM_PROT_EXEC, we will have to visit this issue carefully (esp. regarding security issues.)
|
#
20991 |
|
28-Dec-1996 |
dyson |
The code unnecessarily created an object with no handle up-front, which has the negative effect of disabling some map optimizations. This patch defers the creation of the object until it needs to be at fault time.
Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
20821 |
|
22-Dec-1996 |
joerg |
Make DFLDSIZ and MAXDSIZ fully-supported options.
"Don't forget to do a ``make depend''" :-)
|
#
20449 |
|
14-Dec-1996 |
dyson |
Implement closer-to POSIX mlock semantics. The major difference is that we do allow mlock to span unallocated regions (of course, not mlocking them.) We also allow mlocking of RO regions (which the old code couldn't.) The restriction there is that once a RO region is wired (mlocked), it cannot be debugged (or EVER written to.)
Under normal usage, the new mlock code will be a significant improvement over our old stuff.
|
#
19259 |
|
29-Oct-1996 |
dyson |
Change mmap to use OBJT_DEFAULT instead of OBJT_SWAP by default for anonymous objects. The system will automatically change the type to SWAP if needed (for size or pageout reasons.)
|
#
19142 |
|
24-Oct-1996 |
dyson |
Remove a bogus optimization in the mmap code. It is superfluous, and at best is the same speed as the unoptimized code. At worst, it slows down trivial programs.
|
#
18908 |
|
13-Oct-1996 |
phk |
Remove a stale comment.
|
#
18389 |
|
19-Sep-1996 |
dg |
Fixed bug with reversed trunc/round_page() in madvise...start must be trunced, end must be rounded.
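A sketch of the corrected rule, using hypothetical userland stand-ins for trunc_page() and round_page() and an assumed 4 KB page size: the start of the range is truncated down to a page boundary and the end is rounded up, so the resulting page span always covers the caller's byte range.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)	/* assumed page size */
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)

static uintptr_t
trunc_page_sketch(uintptr_t x)
{
	return (x & ~SKETCH_PAGE_MASK);
}

static uintptr_t
round_page_sketch(uintptr_t x)
{
	return ((x + SKETCH_PAGE_MASK) & ~SKETCH_PAGE_MASK);
}

/* Compute the page-aligned span [*start, *end) covering [addr, addr + len). */
static void
page_span(uintptr_t addr, uintptr_t len, uintptr_t *start, uintptr_t *end)
{
	*start = trunc_page_sketch(addr);	/* start is truncated */
	*end = round_page_sketch(addr + len);	/* end is rounded up */
}
```

With the two operations reversed, a range such as addr = 0x1ff0, len = 0x20 would collapse to an empty span (both ends become 0x2000) instead of covering the two pages it touches.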
|
#
17334 |
|
30-Jul-1996 |
dyson |
Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
|
#
17313 |
|
28-Jul-1996 |
dg |
Slight performance tweak for previous commit.
|
#
17301 |
|
27-Jul-1996 |
dyson |
Allow sequentially created mmap'ed anonymous regions to coalesce. There is little or no reason to create a swap pager for small mmap's. The vm_map_insert code will automatically create a swap pager if the object becomes too large. This fix, per a request from phk.
|
#
17297 |
|
27-Jul-1996 |
dyson |
Remove experimental header file. My test-build must have picked it up in an unexpected place.
Submitted by: jkh
|
#
17294 |
|
27-Jul-1996 |
dyson |
This commit is meant to solve a couple of VM system problems or performance issues.
1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines.
2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations.
3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)).
4) Changed pv queue manipulations/structures to be TAILQ's.
5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one.
6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation.
7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
|
#
16026 |
|
30-May-1996 |
dyson |
This commit is dual-purpose, to fix more of the pageout daemon queue corruption problems, and to apply Gary Palmer's code cleanups. David Greenman helped with these problems also. There is still a hang problem using X in small memory machines.
|
#
15819 |
|
19-May-1996 |
dyson |
Initial support for mincore and madvise. Both are almost fully supported, except madvise does not page in with MADV_WILLNEED, and MADV_DONTNEED doesn't force dirty pages out.
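From userland, the new mincore() call might be exercised roughly as follows. This is a sketch under stated assumptions: the helper name count_resident() is hypothetical, the address is assumed to be page-aligned, and the status vector is passed through a void pointer because its element type differs between systems.

```c
#define _DEFAULT_SOURCE		/* expose mincore() and MAP_ANON on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/*
 * Hypothetical helper: count the pages of [addr, addr + len) that
 * mincore() reports as resident. Returns -1 on error.
 */
static long
count_resident(void *addr, size_t len)
{
	size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
	size_t npages = (len + pagesize - 1) / pagesize;
	unsigned char *vec = malloc(npages);
	long n = 0;

	if (vec == NULL)
		return (-1);
	if (mincore(addr, len, (void *)vec) != 0) {
		free(vec);
		return (-1);
	}
	for (size_t i = 0; i < npages; i++)
		if (vec[i] & 1)		/* low bit set: page is in core */
			n++;
	free(vec);
	return (n);
}
```

A caller could follow this with madvise(addr, len, MADV_WILLNEED), keeping in mind that, per the message above, this initial implementation does not yet page the data in.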
|
#
15809 |
|
18-May-1996 |
dyson |
This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:
More usage of the TAILQ macros.
Additional minor fix to queue.h.
Performance enhancements to the pageout daemon.
Addition of a wait in the case that the pageout daemon has to run immediately.
Slightly modify the pageout algorithm.
Significant revamp of the pmap/fork code:
1) PTE's and UPAGES's are NO LONGER in the process's map.
2) PTE's and UPAGES's reside in their own objects.
3) TOTAL elimination of recursive page table pagefaults.
4) The page directory now resides in the PTE object.
5) Implemented pmap_copy, thereby speeding up fork time.
6) Changed the pv entries so that the head is a pointer and not an entire entry.
7) Significant cleanup of pmap_protect, and pmap_remove.
8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module.
9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed.
Performance and code cleanups in vm_map:
1) Improved and simplified allocation of map entries.
2) Improved vm_map_copy code.
3) Corrected some minor problems in the simplify code.
Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh.
Improved the speed of and simplified kmem_malloc.
Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existent condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.)
Minor reorg of source to perhaps improve locality of ref.
|
#
15583 |
|
03-May-1996 |
phk |
Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
|
#
14638 |
|
16-Mar-1996 |
dg |
Force device mappings to always be shared. It doesn't make sense for them to ever be COW and we need the mappings to be shared for backward compatibility.
Reviewed by: dyson
|
#
14574 |
|
12-Mar-1996 |
dyson |
Allow mmap'ed devices to work correctly across forks. The sanest solution appeared to be to allow the child to maintain the same mapping as the parent.
|
#
14325 |
|
02-Mar-1996 |
peter |
Oops.. I nearly forgot the actual core of the length/rounding/etc fixes that Bruce asked for.
These still are not quite perfect, and in particular, it can get upset on extreme boundary cases (addr = 0xfff, len = 0xffffffff, which would end up mapping a single page rather than failing), but this is better code than I committed before.
(note, the VM system does not (apparently) support single mmap segment sizes above 0x80000000 anyway)
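The boundary case can be worked through with a short sketch. The exact arithmetic in vm_mmap.c is not shown in this log, so the version below assumes 32-bit unsigned values, a 4 KB page size, and the common pattern of folding the in-page offset of addr into the length before rounding. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_MASK 0xfffu	/* assumes 4 KB pages */

/* Naive size computation: the addition may silently wrap around. */
static uint32_t
naive_map_size(uint32_t addr, uint32_t len)
{
	uint32_t pageoff = addr & SK_PAGE_MASK;
	uint32_t size = len + pageoff;		/* may wrap past 2^32 */

	return ((size + SK_PAGE_MASK) & ~SK_PAGE_MASK);
}

/* Checked variant: returns 0 on success, -1 if the size would wrap. */
static int
checked_map_size(uint32_t addr, uint32_t len, uint32_t *size)
{
	uint32_t pageoff = addr & SK_PAGE_MASK;
	uint32_t sum;

	if (len > UINT32_MAX - pageoff)
		return (-1);
	sum = len + pageoff;
	if (sum > UINT32_MAX - SK_PAGE_MASK)
		return (-1);
	*size = (sum + SK_PAGE_MASK) & ~SK_PAGE_MASK;
	return (0);
}
```

With addr = 0xfff and len = 0xffffffff, the naive sum wraps to 0xffe and rounds up to 0x1000, a single page, which matches the behavior described above; the checked variant rejects the request instead.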
|
#
14316 |
|
02-Mar-1996 |
dyson |
1) Eliminate unnecessary bzero of UPAGES.
2) Eliminate unnecessary copying of pages during/after forks.
3) Add user map simplification.
|
#
14221 |
|
23-Feb-1996 |
peter |
kern_descrip.c: add fdshare()/fdcopy()
kern_fork.c: add the tiny bit of code for rfork operation.
kern/sysv_*: shmfork() takes one less arg, it was never used.
sys/shm.h: drop "isvfork" arg from shmfork() prototype
sys/param.h: declare rfork args.. (this is where OpenBSD put it..)
sys/filedesc.h: protos for fdshare/fdcopy.
vm/vm_mmap.c: add minherit code, add rounding to mmap() type args where it makes sense.
vm/*: drop unused isvfork arg.
Note: this rfork() implementation copies the address space mappings, it does not connect the mappings together. ie: once the two processes have split, the pages may be shared, but the address space is not. If one does a mmap() etc, it does not appear in the other. This makes it not useful for pthreads, but it is useful in its own right for having light-weight threads in a static shared address space.
Obtained from: Original by Ron Minnich, extended by OpenBSD
|
#
13490 |
|
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
|
#
12904 |
|
17-Dec-1995 |
bde |
Fixed 1TB filesize changes. Some pindexes had bogus names and types but worked because vm_pindex_t is indistinguishable from vm_offset_t.
|
#
12808 |
|
13-Dec-1995 |
dyson |
There was a bug that the size for an msync'ed region was not rounded up. The effect of this was that msync with a size would generally sync 1 page less than it should. This problem was brought to my attention by Darrel Herbst <dherbst@gradin.cis.upenn.edu> and Ron Minnich <rminnich@sarnoff.com>.
|
#
12767 |
|
11-Dec-1995 |
dyson |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
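A sketch of the naming change, assuming 4 KB pages, a 32-bit page index, and a 64-bit byte offset. The type and function names here are illustrative stand-ins, not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12		/* assumes 4 KB pages */

typedef uint32_t sk_pindex_t;		/* page index within an object */
typedef uint64_t sk_ooffset_t;		/* byte offset within an object */

/* Convert a byte offset to the index of the page containing it. */
static sk_pindex_t
off_to_idx(sk_ooffset_t off)
{
	return ((sk_pindex_t)(off >> SK_PAGE_SHIFT));
}

/* Convert a page index back to the byte offset of the page's start. */
static sk_ooffset_t
idx_to_off(sk_pindex_t idx)
{
	return ((sk_ooffset_t)idx << SK_PAGE_SHIFT);
}
```

Under these assumptions a byte offset of 1 TB (2^40) corresponds to page index 2^28, which fits comfortably in 32 bits, whereas a 32-bit byte offset could not address anything past 4 GB.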
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12591 |
|
03-Dec-1995 |
bde |
Completed function declarations and/or added prototypes.
Staticized some functions.
__purified some functions. Some functions were bogusly declared as returning `const'. This hasn't done anything since gcc-2.5. For later versions of gcc, the equivalent is __attribute__((const)) at the end of function declarations.
|
#
12221 |
|
12-Nov-1995 |
bde |
Included <sys/sysproto.h> to get central declarations for syscall args structs and prototypes for syscalls.
Ifdefed duplicated decentralized declarations of args structs. It's convenient to have this visible but they are hard to maintain. Some are already different from the central declarations. 4.4lite2 puts them in comments in the function headers but I wanted to avoid the large changes for that.
|
#
11705 |
|
23-Oct-1995 |
dyson |
First phase of removing the PG_COPYONWRITE flag, and an architectural cleanup of mapping files.
|
#
11621 |
|
21-Oct-1995 |
dyson |
Implement mincore system call.
|
#
9507 |
|
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although it will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #13 and #14 were done to simplify the code and improve readability and maintainability (as were most of these changes).
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
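Item 2 of the entry above says that objects now carry a type and that the type indexes the appropriate pager. A minimal sketch of that dispatch pattern follows. Every identifier here (`obj_type`, `object`, `pager_ops`, `pager_table`, `pager_haspage`) is hypothetical and chosen for illustration; none is taken from the committed kernel code, and the per-type behavior is a placeholder.

```c
#include <stddef.h>

/* Hypothetical object types; each one selects a set of pager routines. */
enum obj_type { OBJ_DEFAULT, OBJ_SWAP, OBJ_VNODE, OBJ_TYPE_COUNT };

/* Hypothetical object: it records its type instead of pointing at a
   separate "pager" structure. */
struct object {
    enum obj_type type;
    size_t size;            /* object size in pages */
};

/* Routines a pager supplies; only haspage is sketched here. */
struct pager_ops {
    int (*haspage)(const struct object *obj, size_t pindex);
};

/* Placeholder policy: a default object has no backing store yet. */
static int default_haspage(const struct object *obj, size_t pindex)
{
    (void)obj;
    (void)pindex;
    return 0;
}

/* Placeholder policy: a page exists if its index is inside the object. */
static int sized_haspage(const struct object *obj, size_t pindex)
{
    return pindex < obj->size;
}

/* Table of pager routines indexed by object type. */
static const struct pager_ops pager_table[OBJ_TYPE_COUNT] = {
    [OBJ_DEFAULT] = { default_haspage },
    [OBJ_SWAP]    = { sized_haspage },
    [OBJ_VNODE]   = { sized_haspage },
};

/* Dispatch through the table using the object's own type. */
int pager_haspage(const struct object *obj, size_t pindex)
{
    return pager_table[obj->type].haspage(obj, pindex);
}
```

With this layout, callers pass the object itself to the pager interface, which matches the statement that pager routines now take objects rather than "pagers".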
#
9456 |
|
09-Jul-1995 |
dg |
Moved call to VOP_GETATTR() out of vnode_pager_alloc() and into the places that call vnode_pager_alloc() so that a failure return can be dealt with. This fixes a panic seen on NFS clients when a file being opened is deleted on the server before the open completes.
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8585 |
|
18-May-1995 |
dg |
Accessing pages beyond the end of a mapped file results in internal inconsistencies in the VM system that eventually lead to a panic. These changes fix the behavior to conform to the behavior in SunOS, which is to deny faults to pages beyond the EOF (returning SIGBUS). Internally, this is implemented by requiring faults to be within the object size boundaries. These changes exposed another bug, namely that passing in an offset to mmap when trying to map an unnamed anonymous region also results in internal inconsistencies. In this case, the offset is forced to zero.
Reviewed by: John Dyson and others
|
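The two rules described in the entry above (faults must fall within the object size, and anonymous mappings ignore the supplied offset) can be sketched as follows. The function names and the choice to return the signal number directly are assumptions made for illustration; the kernel code is structured differently.

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical check: a fault is allowed only if its offset lies inside
   the object. Returns 0 when allowed, SIGBUS otherwise. */
int check_fault_offset(size_t offset, size_t object_size)
{
    return offset < object_size ? 0 : SIGBUS;
}

/* Hypothetical helper: an unnamed anonymous mapping has no meaningful
   file offset, so any caller-supplied value is replaced with zero. */
size_t effective_offset(int is_anonymous, size_t offset)
{
    return is_anonymous ? 0 : offset;
}
```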
#
7883 |
|
16-Apr-1995 |
dg |
Moved some zero-initialized variables into .bss. Made code intended to be called only from DDB #ifdef DDB. Removed some completely unused globals.
|
#
7366 |
|
25-Mar-1995 |
dg |
Fix logic bug I just introduced with the flags to msync().
|
#
7364 |
|
25-Mar-1995 |
dg |
Disallow both MS_ASYNC and MS_INVALIDATE flags being set at the same time in msync().
|
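A minimal sketch of the flag check this entry describes, written as a user-space helper. The name `check_msync_flags` is hypothetical, and returning EINVAL is an assumption, since the commit message does not name the error code.

```c
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical validation helper: reject a request that sets both
   MS_ASYNC and MS_INVALIDATE. Returns 0 if the combination is
   acceptable, EINVAL (assumed error code) otherwise. */
int check_msync_flags(int flags)
{
    if ((flags & MS_ASYNC) != 0 && (flags & MS_INVALIDATE) != 0)
        return EINVAL;
    return 0;
}
```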
#
7360 |
|
25-Mar-1995 |
dg |
Added "flags" argument to msync, and implemented MS_ASYNC and MS_INVALIDATE. The MS_ASYNC flag doesn't currently work, and MS_INVALIDATE will only toss out the pages in the address space (not all pages in the shadow chain).
|
#
7239 |
|
22-Mar-1995 |
dg |
Fixed bug in vm_mmap() where the object that is created in some cases was the wrong size. This is the likely cause of panics reported by Lars Fredriksen and Paul Richards related to a -1 blkno when paging via the swap_pager.
Submitted by: John Dyson
|
#
7215 |
|
21-Mar-1995 |
dg |
Disallow non page-aligned file offsets in vm_mmap(). We don't support this in either the high or low level parts of the VM system. Just return EINVAL in this case, just like SunOS does.
|
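The alignment rule in this entry amounts to a mask test on the file offset. A minimal sketch follows, assuming the page size is a power of two; the function name `check_file_offset` and its parameters are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical check: return EINVAL when the file offset is not a
   multiple of the page size, 0 otherwise. Assumes page_size is a
   power of two, so the low bits of an aligned offset are all zero. */
int check_file_offset(uint64_t offset, uint64_t page_size)
{
    if ((offset & (page_size - 1)) != 0)
        return EINVAL;
    return 0;
}
```

For example, with a 4096-byte page an offset of 8192 passes, while 4097 is rejected.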
#
7209 |
|
21-Mar-1995 |
dg |
Fixed bug in the size == 0 case of msync() caused by a bogus return value check.
|
#
7170 |
|
19-Mar-1995 |
dg |
Removed redundant newlines that were in some panic strings.
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
7017 |
|
12-Mar-1995 |
dg |
Fixed obsolete comment.
|
#
6944 |
|
07-Mar-1995 |
dg |
Fixed object reference count problem that occurred in the MAP_PRIVATE case after we rewrote vm_mmap(). Added some comments to make it easier to follow the reference counts.
|
#
6617 |
|
22-Feb-1995 |
dg |
Rewrote MAP_PRIVATE case of vm_mmap() - all of the COW portion of this routine was highly convoluted.
Submitted by: John Dyson
|
#
6585 |
|
20-Feb-1995 |
dg |
Deprecated remaining use of vm_deallocate. Deprecated vm_allocate_with_pager(). Almost completely rewrote vm_mmap(); when John gets done with the bottom half, it will be a complete rewrite. Deprecated most use of vm_object_setpager(). Removed side effect of setting object persist in vm_object_enter and moved this into the pager(s). A few other cosmetic changes.
|
#
6435 |
|
15-Feb-1995 |
dg |
Don't bother calling pmap_create() when creating the temporary map. The whole COW section of vm_mmap() should be rewritten; the current implementation is highly convoluted.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c: Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c: Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c, vm_map.c: Dynamic kernel VM size; now we don't have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c: Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h: Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c: Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c: Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c: Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c: Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
3449 |
|
08-Oct-1994 |
phk |
Cosmetics: unused vars, ()'s, #include's &c &c to silence gcc.
Reviewed by: davidg
|
#
2462 |
|
02-Sep-1994 |
dg |
Whoops, accidentally left out some pieces of the munmapfd patch.
|
#
1885 |
|
06-Aug-1994 |
dg |
Enabled page table preloading of cached objects.
Submitted by: John Dyson
|
#
1827 |
|
04-Aug-1994 |
dg |
Integrated VM system improvements/fixes from FreeBSD-1.1.5.
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|