History log of /freebsd-current/sys/vm/vm_kern.h
Revision Date Author Comments
# 7a79d066 09-Apr-2024 Bojan Novković <bnovkov@FreeBSD.org>

vm: improve kstack_object pindex calculation to avoid pindex holes

This commit replaces the linear transformation of kernel virtual
addresses to kstack_object pindex values with a non-linear
scheme that circumvents physical memory fragmentation caused by
kernel stack guard pages. The new mapping scheme is used to
effectively "skip" guard pages and assign pindices for
non-guard pages in a contiguous fashion.

The new allocation scheme requires that all default-sized kstack KVAs
come from a separate, specially aligned region of the KVA space.
For this to work, this commited introduces a dedicated per-domain
kstack KVA arena used to allocate kernel stacks of default size.
The behaviour on 32-bit platforms remains unchanged due to a
significatly smaller KVA space.

Aside from fullfilling the requirements imposed by the new scheme, a
separate kstack KVA arena facilitates superpage promotion in the rest
of kernel and causes most kstacks to have guard pages at both ends.

Reviewed by: alc, kib, markj
Tested by: markj
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D38852


# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 7dd979df 17-Aug-2020 Mark Johnston <markj@FreeBSD.org>

Remove the VM map zone.

Today, the zone is only used to allocate a trio of kernel maps: the
kernel map itself, and the exec and pipe submaps. Maps for user
processes are dynamically allocated but are embedded in the vmspace
structure, which is allocated from its own zone. Make the
aforementioned kernel maps statically allocated and get rid of the zone.

While here, remove a stale comment above vmspace_alloc() and change the
names of locks initialized in vm_map_init() to match vmspace_zinit().

Reported by: alc
Reviewed by: alc, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26052


# f13fa9df 26-Apr-2020 Mark Johnston <markj@FreeBSD.org>

Use a single VM object for kernel stacks.

Previously we allocated a separate VM object for each kernel stack.
However, fully constructed kernel stacks are cached by UMA, so there is
no harm in using a single global object for all stacks. This reduces
memory consumption and makes it easier to define a memory allocation
policy for kernel stack pages, with the aim of reducing physical memory
fragmentation.

Add a global kstack_object, and use the stack KVA address to index into
the object like we do with kernel_object.

Reviewed by: kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24473


# 46b0292a 16-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Do not reserve KVA for paging bufs in vm_ksubmap_init(), since now
they allocate it in pbuf_init(). This should have been done together
with r343030.


# 969e147a 20-Sep-2018 Mark Johnston <markj@FreeBSD.org>

Ensure that imports into per-domain kmem arenas are KVA_QUANTUM-aligned.

The old code appears to assume that vmem_alloc() would import
size-aligned KVA chunks from the parent kernel_arena, but vmem doesn't
provide this guarantee.

Also remove the unused global RWX arena and add comments explaining why
we have per-domain arenas.

Reported by: alc
Reviewed by: alc, kib (previous version)
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17249


# 0766f278 13-Jun-2018 Jonathan T. Looney <jtl@FreeBSD.org>

Make UMA and malloc(9) return non-executable memory in most cases.

Most kernel memory that is allocated after boot does not need to be
executable. There are a few exceptions. For example, kernel modules
do need executable memory, but they don't use UMA or malloc(9). The
BPF JIT compiler also needs executable memory and did use malloc(9)
until r317072.

(Note that a side effect of r316767 was that the "small allocation"
path in UMA on amd64 already returned non-executable memory. This
meant that some calls to malloc(9) or the UMA zone(9) allocator could
return executable memory, while others could return non-executable
memory. This change makes the behavior consistent.)

This change makes malloc(9) return non-executable memory unless the new
M_EXEC flag is specified. After this change, the UMA zone(9) allocator
will always return non-executable memory, and a KASSERT will catch
attempts to use the M_EXEC flag to allocate executable memory using
uma_zalloc() or its variants.

Allocations that do need executable memory have various choices. They
may use the M_EXEC flag to malloc(9), or they may use a different VM
interfact to obtain executable pages.

Now that malloc(9) again allows executable allocations, this change also
reverts most of r317072.

PR: 228927
Reviewed by: alc, kib, markj, jhb (previous version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D15691


# 796df753 30-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

SPDX: Consider code from Carnegie-Mellon University.

Interesting cases, most likely from CMU Mach sources.


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# fbbd9655 28-Feb-2017 Warner Losh <imp@FreeBSD.org>

Renumber copyright clause 4

Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96


# ec492b13 04-Jan-2017 Mark Johnston <markj@FreeBSD.org>

Add a small allocator for exec_map entries.

Upon each execve, we allocate a KVA range for use in copying data to the
new image. Pages must be faulted into the range, and when the range is
freed, the backing pages are freed and their mappings are destroyed. This
is a lot of needless overhead, and the exec_map management becomes a
bottleneck when many CPUs are executing execve concurrently. Moreover, the
number of available ranges is fixed at 16, which is insufficient on large
systems and potentially excessive on 32-bit systems.

The new allocator reduces overhead by making exec_map allocations
persistent. When a range is freed, pages backing the range are marked clean
and made easy to reclaim. With this change, the exec_map is sized based on
the number of CPUs.

Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8921


# 8441d1e8 13-Aug-2013 Jeff Roberson <jeff@FreeBSD.org>

- Add a statically allocated memguard arena since it is needed very early
on.
- Pass the appropriate flags to vmem_xalloc() when allocating space for
the arena from kmem_arena.

Sponsored by: EMC / Isilon Storage Division


# 5df87b21 07-Aug-2013 Jeff Roberson <jeff@FreeBSD.org>

Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

- Normalize functions that allocate memory to use kmem_*
- Those that allocate address space are named kva_*
- Those that operate on maps are named kmap_*
- Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by: alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division


# 5f518366 27-Jun-2013 Jeff Roberson <jeff@FreeBSD.org>

- Add a general purpose resource allocator, vmem, from NetBSD. It was
originally inspired by the Solaris vmem detailed in the proceedings
of usenix 2001. The NetBSD version was heavily refactored for bugs
and simplicity.
- Use this resource allocator to allocate the buffer and transient maps.
Buffer cache defrags are reduced by 25% when used by filesystems with
mixed block sizes. Ultimately this may permit dynamic buffer cache
sizing on low KVA machines.

Discussed with: alc, kib, attilio
Tested by: pho
Sponsored by: EMC / Isilon Storage Division


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# b89eaf4e 05-Jul-2008 Alan Cox <alc@FreeBSD.org>

Enable the creation of a kmem map larger than 4GB.
Submitted by: Tz-Huan Huang

Make several variables related to kmem map auto-sizing static.
Found by: CScout


# 9bed18a4 20-Nov-2006 Ruslan Ermilov <ru@FreeBSD.org>

The clean_map has been made local to vm_init.c long ago.


# 60727d8b 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for license, minor formatting changes


# 05eb3785 06-Apr-2004 Warner Losh <imp@FreeBSD.org>

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# cebde069 10-Aug-2003 Mike Silbersack <silby@FreeBSD.org>

More pipe changes:

From alc:
Move pageable pipe memory to a seperate kernel submap to avoid awkward
vm map interlocking issues. (Bad explanation provided by me.)

From me:
Rework pipespace accounting code to handle this new layout, and adjust
our default values to account for the fact that we now have a solid
limit on allocations.

Also, remove the "maxpipes" limit, as it no longer has a purpose.
(The limit on kva usage solves the problem of having two many pipes.)


# 53b19636 15-Feb-2003 Alan Cox <alc@FreeBSD.org>

Move kernel_vm_end's declaration to pmap.h; add a comment regarding the
synchronization of access to kernel_vm_end.


# 03adb816 21-May-2002 Alan Cox <alc@FreeBSD.org>

o Make contigmalloc1() static.


# 08442f8a 22-Jun-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Introduce numerous SMP friendly changes to the mbuf allocator. Namely,
introduce a modified allocation mechanism for mbufs and mbuf clusters; one
which can scale under SMP and which offers the possibility of resource
reclamation to be implemented in the future. Notable advantages:

o Reduce contention for SMP by offering per-CPU pools and locks.
o Better use of data cache due to per-CPU pools.
o Much less code cache pollution due to excessively large allocation macros.
o Framework for `grouping' objects from same page together so as to be able
to possibly free wired-down pages back to the system if they are no longer
needed by the network stacks.

Additional things changed with this addition:

- Moved some mbuf specific declarations and initializations from
sys/conf/param.c into mbuf-specific code where they belong.
- m_getclr() has been renamed to m_get_clrd() because the old name is really
confusing. m_getclr() HAS been preserved though and is defined to the new
name. No tree sweep has been done "to change the interface," as the old
name will continue to be supported and is not depracated. The change was
merely done because m_getclr() sounds too much like "m_get a cluster."
- TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and
systat(1) (see TODO below).
- Fixed systat(1) to display number of "free mbufs" based on new per-CPU
stat structures.
- Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
per-CPU stat structures. All infos are fetched via sysctl.

TODO (in order of priority):

- Re-enable mbtypes statistics in both netstat(1) and systat(1) after
introducing an SMP friendly way to collect the mbtypes stats under the
already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
seems too costly for a mere stat update, especially when other locks are
already present).
- Optionally have systat(1) display not only "total free mbufs" but also
"total free mbufs per CPU pool."
- Fix minor length-fetching issues in netstat(1) related to recently
re-enabled option to read mbuf stats from a core file.
- Move reference counters at least for mbuf clusters into an unused portion
of the cluster itself, to save space and need to allocate a counter.
- Look into introducing resource freeing possibly from a kproc.

Reviewed by (in parts): jlemon, jake, silby, terry
Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
Preliminary performance measurements: jlemon (and me, obviously)
URL: http://people.freebsd.org/~bmilekic/mb_alloc/


# 1f6889a1 16-Feb-2000 Matthew Dillon <dillon@FreeBSD.org>

Fix null-pointer dereference crash when the system is intentionally
run out of KVM through a mmap()/fork() bomb that allocates hundreds
of thousands of vm_map_entry structures.

Add panic to make null-pointer dereference crash a little more verbose.

Add a new sysctl, vm.max_proc_mmap, which specifies the maximum number
of mmap()'d spaces (discrete vm_map_entry's in the process). The value
defaults to around 9000 for a 128MB machine. The test is scaled for the
number of processes sharing a vmspace (aka linux threads). Setting
the value to 0 disables the feature.

PR: kern/16573
Approved by: jkh


# 02577fa2 28-Oct-1999 Alan Cox <alc@FreeBSD.org>

Remove the last vestiges of "vm_map_t phys_map". It's been unused
since i386/i386/machdep.c rev 1.45 (or 1994 :-) ).


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 76782487 15-Aug-1999 Alan Cox <alc@FreeBSD.org>

Remove the declarations for "vm_map_t io_map". It's been unused
since i386/i386/machdep rev 1.310, i.e., the demise of BOUNCE_BUFFERS.


# aecb0ebb 15-Aug-1999 Alan Cox <alc@FreeBSD.org>

Remove the declarations for "vm_map_t u_map". It's been unused
since i386/i386/pmap rev 1.190. (The alpha never used it.)


# 3820ec1d 11-Oct-1997 Peter Wemm <peter@FreeBSD.org>

Attempt to fix the previous fix to the contigmalloc1 prototype.
struct malloc_type isn't defined in all cases (eg: from ddb), and the line
wrapping was very badly mangled.


# f0d45e6a 10-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Fix contigmalloc() and contigmalloc1() arguments.


# 3b18caba 22-Jun-1997 Peter Wemm <peter@FreeBSD.org>

Kill some stale leftovers from the earlier attempts at SMP per-cpu pages


# 477a642c 26-Apr-1997 Peter Wemm <peter@FreeBSD.org>

Man the liferafts! Here comes the long awaited SMP -> -current merge!

There are various options documented in i386/conf/LINT, there is more to
come over the next few days.

The kernel should run pretty much "as before" without the options to
activate SMP mode.

There are a handful of known "loose ends" that need to be fixed, but
have been put off since the SMP kernel is in a moderately good condition
at the moment.

This commit is the result of the tinkering and testing over the last 14
months by many people. A special thanks to Steve Passe for implementing
the APIC code!


# 9caaadb6 31-Mar-1997 David Greenman <dg@FreeBSD.org>

Changed the way that the exec image header is read to be filesystem-
centric rather than VM-centric to fix a problem with errors not being
detectable when the header is read.
Killed exech_map as a result of these changes.
There appears to be no performance difference with this change.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 10825343 13-Feb-1997 Garrett Wollman <wollman@FreeBSD.org>

Provide an alternative interface to contigmalloc() which allows a specific
map to be used when allocating the kernel va (e.g., mb_map). The VM
gurus may want to look this over.


# 649c409d 15-Jan-1997 David Greenman <dg@FreeBSD.org>

Fix bug related to map entry allocations where a sleep might be attempted
when allocating memory for network buffers at interrupt time. This is due
to inadequate checking for the new mcl_map. Fixed by merging mb_map and
mcl_map into a single mb_map.

Reviewed by: wollman


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 1b4435b8 17-May-1996 John Dyson <dyson@FreeBSD.org>

File mistakenly left out of the previous mega-commit. This provides
a global defn for 'exech_map.'


# cb7545a9 10-May-1996 Garrett Wollman <wollman@FreeBSD.org>

Allocate mbufs from a separate submap so that NMBCLUSTERS works as
expected.


# 28f8db14 29-Jul-1995 Bruce Evans <bde@FreeBSD.org>

Eliminate sloppy common-style declarations. There should be none left for
the LINT configuation.


# 0d94caff 09-Jan-1995 David Greenman <dg@FreeBSD.org>

These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code. No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.

Submitted by: John Dyson and David Greenman


# f23b4c91 18-Aug-1994 Garrett Wollman <wollman@FreeBSD.org>

Fix up some sloppy coding practices:

- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.


# 3c4dd356 02-Aug-1994 David Greenman <dg@FreeBSD.org>

Added $Id$


# 26f9a767 25-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# df8bae1d 24-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

BSD 4.4 Lite Kernel Sources