History log of /freebsd-10.1-release/sys/arm/arm/pmap-v6.c
Revision Date Author Comments
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 270920 01-Sep-2014 kib

Fix a leak of wired pages when unwiring a PROT_NONE-mapped wired
region. Rework the handling of unwire to do it in batch, both at the
pmap and object level.

All commits below are by alc.

MFC r268327:
Introduce pmap_unwire().

MFC r268591:
Implement pmap_unwire() for powerpc.

MFC r268776:
Implement pmap_unwire() for arm.

MFC r268806:
pmap_unwire(9) man page.

MFC r269134:
When unwiring a region of an address space, do not assume that the
underlying physical pages are mapped by the pmap. This fixes a leak
of the wired pages on the unwiring of the region mapped with no access
allowed.

MFC r269339:
In the implementation of the new function pmap_unwire(), the call to
MOEA64_PVO_TO_PTE() must be performed before any changes are made to the
PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic.

MFC r269365:
Correct a long-standing problem in moea{,64}_pvo_enter() that was revealed
by the combination of r268591 and r269134: When we attempt to add the
wired attribute to an existing mapping, moea{,64}_pvo_enter() do nothing.
(They only set the wired attribute on newly created mappings.)

MFC r269433:
Handle wiring failures in vm_map_wire() with the new functions
pmap_unwire() and vm_object_unwire().
Retire vm_fault_{un,}wire(), since they are no longer used.

MFC r269438:
Rewrite a loop in vm_map_wire() so that gcc doesn't think that the variable
"rv" is uninitialized.

MFC r269485:
Retire pmap_change_wiring().

Reviewed by: alc
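
For reference, the new KPI replaces the per-page pmap_change_wiring() with a
single range-based call. A minimal sketch of its use (kernel-only code; the
caller context is illustrative, not the exact vm_map_unwire() code):

    #include <sys/param.h>
    #include <vm/vm.h>
    #include <vm/pmap.h>

    /*
     * Unwire every mapping in [sva, eva) with one batched call.  Unlike
     * the retired pmap_change_wiring(), pmap_unwire() must tolerate holes
     * in the range -- e.g. a PROT_NONE region whose pages were never
     * entered into the pmap -- which is exactly the leak fixed above.
     */
    static void
    entry_unwire(pmap_t pmap, vm_offset_t sva, vm_offset_t eva)
    {
        pmap_unwire(pmap, sva, eva);    /* pmap-level, batched */
        /* object-level unwiring is likewise batched (vm_object_unwire) */
    }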


# 270439 24-Aug-2014 kib

Merge the changes to pmap_enter(9) for sleep-less operation (requested
by flag). The ia64 pmap.c changes are a direct commit, since ia64 is
removed on head.

MFC r269368 (by alc):
Retire PVO_EXECUTABLE.

MFC r269728:
Change pmap_enter(9) interface to take flags parameter and superpage
mapping size (currently unused).

MFC r269759 (by alc):
Update the text of a KASSERT() to reflect the changes in r269728.

MFC r269822 (by alc):
Change {_,}pmap_allocpte() so that they look for the flag
PMAP_ENTER_NOSLEEP instead of M_NOWAIT/M_WAITOK when deciding whether
to sleep on page table page allocation.

MFC r270151 (by alc):
Replace KASSERT that no PV list locks are held with a conditional
unlock.

Reviewed by: alc
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
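
The reworked interface has roughly the following shape (a sketch of the
post-r269728 prototype; PMAP_ENTER_NOSLEEP is the flag named above):

    #include <vm/vm.h>
    #include <vm/pmap.h>

    /*
     * Wiring and sleep behavior are now requested via "flags" instead of
     * a boolean, and "psind" carries the superpage mapping size
     * (currently unused).  Returns an error instead of sleeping when
     * PMAP_ENTER_NOSLEEP is set and a page table page is unavailable.
     */
    int pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m,
            vm_prot_t prot, u_int flags, int8_t psind);

    /* A caller that must not sleep:
     *   error = pmap_enter(pmap, va, m, prot, PMAP_ENTER_NOSLEEP, 0);
     */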


# 269103 25-Jul-2014 ian

MFC r266565, r266651:

Map device memory using PTE_DEVICE attributes, and also ensure that the
shared flag is set on normal-memory mappings made via pmap_kenter() for SMP.

The "shared flag" part of this change isn't obvious from the diff, here's
the deal... by using the array of preformatted page table entry templates
instead of constructing the PTE from scratch, we automatically get the
right attribute bits set for both caching and shared.

Fix whitespace glitches.
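
The template idea, in a sketch with hypothetical names (the real arrays and
bit values live in pmap-v6.c and the pte headers):

    #include <stdint.h>

    /* Placeholder encodings -- illustrative, not the real ARM values. */
    #define PTE_PROTO       0x002u  /* small-page descriptor type */
    #define PTE_CACHE_WBWA  0x04cu  /* normal memory, write-back */
    #define PTE_SHARED      0x400u  /* shareable -- required for SMP */
    #define PTE_DEVICE      0x0b0u  /* device-memory attributes */

    enum mem_type { MT_NORMAL, MT_DEVICE };

    /*
     * Indexing preformatted templates by memory type yields both the
     * right cache attributes and the shared bit automatically, instead
     * of assembling the PTE by hand at each pmap_kenter() call site.
     */
    static const uint32_t pte_template[] = {
        [MT_NORMAL] = PTE_PROTO | PTE_CACHE_WBWA | PTE_SHARED,
        [MT_DEVICE] = PTE_PROTO | PTE_DEVICE,
    };

    /* pmap_kenter()-style use: *ptep = pa | prot | pte_template[type]; */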


# 269072 24-Jul-2014 kib

MFC r267213 (by alc):
Add a page size field to struct vm_page.

Approved by: alc


# 266412 18-May-2014 ian

MFC 258287: Implement pmap_align_superpage().


# 266369 17-May-2014 ian

MFC 264702: Remove unnecessary armv6 cache and TLB maintenance ops.


# 266357 17-May-2014 ian

MFC 264183: Add a couple more required TLB flushes.


# 266353 17-May-2014 ian

MFC 264128, 264129, 264130, 264135:

Fix the TTB set operation for armv7. Perform synchronization (by "isb" barrier)
after TTB is set.

Fix TLB maintenance issues for armv6 and armv7.

- Add cpu_cpwait to comply with the convention.
- Add missing TLB invalidations, especially in pmap_kenter & pmap_kremove,
distinguishing between D and ID pages.
- Modify pmap init/bootstrap invalidations to ID, just to be safe.
- Fix TLB-inv and PTE_SYNC ordering.

Allocate per-cpu resources for doing pmap_zero_page() and pmap_copy_page().
This is a performance enhancement rather than a bugfix.

We don't support any ARM systems with an ISA bus and don't need a freelist
of memory to support ISA addressing limitations.


# 266199 15-May-2014 ian

MFC r261917, r261918, r261919, r261920, r261921, r261922

Always clear L1 PTE descriptor when removing superpage on ARM

Invalidate the L1 PTE regardless of the existence of the corresponding
l2_bucket.

Ensure proper TLB invalidation on superpage promotion and demotion on ARM

Base pages within a newly created superpage need to be invalidated so that
the new mapping is "visible" immediately after creation.

Fix superpage promotion on ARM with respect to RO/RW and wired attributes

Avoid redundant superpage promotion attempts on ARM

Remove spurious assertion from pmap_extract_locked() on ARM

Handle pmap_enter() on already promoted mappings for ARMv6/v7


# 266160 15-May-2014 ian

MFC r261423, r261424, r261516, r261513, r261562, r261563, r261564, r261565,
r261596, r261606

Add the imx sdhci controller.

Move Open Firmware device root on PowerPC, ARM, and MIPS systems to
a sub-node of nexus (ofwbus) rather than direct attach under nexus. This
fixes FDT on x86 and will make coexistence with ACPI on ARM systems easier.
SPARC is unchanged.

Add the missing ')' at end of sentence. Reword it to use a more common idiom.

Pass the kernel physical address to initarm through the boot param struct.

Make functions only used in vfp.c static, and remove vfp_enable.

Fix __syscall on armeb EABI. As it returns a 64-bit value it needs to
place 32-bit data in r1, not r0. 64-bit data is already packed correctly.

Use abp_physaddr for the physical address over KERNPHYSADDR. This helps us
remove the need to load the kernel at a fixed address.

Remove references to PHYSADDR where it's used only in debugging output.

Dynamically generate the page table. This will allow us to detect the
physical address we are loaded at and change the mapping accordingly.


# 266058 14-May-2014 ian

MFC r258359, r258742, r258845, r259936, r259640

Apply access flags for managed and unmanaged pages properly on ARMv6/v7

Set the PGA_WRITEABLE flag when the protections indicate write access, not
just when the current access is a write.

Enable missing Access Flag for secondary cores on ARMv6/v7

Add identification and necessary type checks for Krait CPU cores.


# 266050 14-May-2014 ian

MFC r256707, r256708, r257291, r258358

Switch to using WBWA mappings for page tables on armv6; this is needed for SMP.
Fix PTE_SYNC() for PIPT L2 caches; using the virtual address wasn't so useful.
Use PTE_SYNC() for >= armv6
Spell cpu_l2cache_wb_range correctly.

Fix condition that determines PMAP_NEEDS_PTE_SYNC value for ARM

Use the values of the correct defines to determine the statement's result.
ARM_ARCH_ symbols are always defined, hence only their values are relevant.

Avoid clearing EXEC permission bit when setting the page RW on ARMv6/v7

When emulating the modified bit, the executable attribute was cleared by
mistake when calling pmap_set_prot().


# 259963 27-Dec-2013 adrian

Revert r252694 from stable/10 to fix instabilities seen with jemalloc + dhclient/sshd.

This is a direct commit to stable/10 as the VM code has changed
since the stable/10 branch.

PR: kern/185046


# 259364 13-Dec-2013 ian

MFC r257648, r257649, r257660:

Begin reducing code duplication in arm pmap.c and pmap-v6.c by factoring
out common code related to mapping device memory into a new devmap.c file.

Remove the growing duplication of code that used pmap_devmap_find_pa() and
then did some math with the returned results to generate a virtual address,
and likewise in reverse to get a physical address. Now there is a pair
of functions, arm_devmap_vtop() and arm_devmap_ptov(), to do that. The
bus_space_map() implementations are rewritten in terms of these.

Move remaining code and data related to static device mapping into the
new devmap.[ch] files. Emphasize the MD nature of these things by using
the prefix arm_devmap_ on the function and type names (a few of these
things already found their way into MI code; hopefully it will be harder
to do by accident in the future).
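
The prototypes of that pair, as a sketch (argument names illustrative; the
implementations walk the table of static devmap entries in both directions):

    #include <vm/vm.h>

    void       *arm_devmap_ptov(vm_paddr_t pa, vm_size_t size);
    vm_paddr_t  arm_devmap_vtop(void *vpva, vm_size_t size);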


# 259335 13-Dec-2013 ian

MFC r257201, r257202

Retire arm_remap_nocache() and the data and constants associated with it.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 255724 20-Sep-2013 alc

The pmap function pmap_clear_reference() is no longer used. Remove it.

pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8. Now, that call no
longer exists.

Approved by: re (kib)
Sponsored by: EMC / Isilon Storage Division


# 255612 16-Sep-2013 zbb

Implement pmap_advise() for ARMv6/v7 pmap module

Apply the given advice to the specified range of addresses within the
given pmap. Depending on the advice, clear the referenced and/or
modified flags in each mapping. Superpages within the given range will
be demoted or destroyed.

Reviewed by: alc
Approved by: cognet (mentor)
Approved by: re


# 255611 16-Sep-2013 zbb

Write protect base page after superpage demotion so that it may repromote

When clearing the modification status of the superpage, one of the
base pages produced during demotion should be marked as write disabled.
The intention is that a subsequent write access may repromote.
In the current implementation this was done incorrectly: write permission
was granted instead of forbidden.

Approved by: cognet (mentor)
Approved by: re


# 255028 29-Aug-2013 alc

Significantly reduce the cost, i.e., run time, of calls to madvise(...,
MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new
pmap function, pmap_advise(), that operates on a range of virtual addresses
within the specified pmap, allowing for a more efficient implementation of
MADV_DONTNEED and MADV_FREE. Previously, the implementation of
MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as
pmap_clear_reference(). Intuitively, the problem with this implementation
is that the pmap-level locks are acquired and released and the page table
traversed repeatedly, once for each resident page in the range
that was specified to madvise(2). A more subtle flaw with the previous
implementation is that pmap_clear_reference() would clear the reference bit
on all mappings to the specified page, not just the mapping in the range
specified to madvise(2).

Since our malloc(3) makes heavy use of madvise(2), this change can have a
measurable impact. For example, the system time for completing a parallel
"buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.

Note: This change only contains pmap_advise() implementations for a subset
of our supported architectures. I will commit implementations for the
remaining architectures after further testing. For now, a stub function is
sufficient because of the advisory nature of pmap_advise().

Discussed with: jeff, jhb, kib
Tested by: pho (i386), marcel (ia64)
Sponsored by: EMC / Isilon Storage Division
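
A userland view of what pmap_advise() speeds up: one madvise(2) call over a
range now costs a single pmap-level pass instead of per-page pmap operations
(standalone example; compiles on FreeBSD):

    #include <string.h>
    #include <sys/mman.h>

    int
    main(void)
    {
        size_t len = 1024 * 1024;
        char *p;

        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED)
            return (1);
        memset(p, 0, len);              /* fault the pages in */
        madvise(p, len, MADV_FREE);     /* contents no longer needed */
        munmap(p, len);
        return (0);
    }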


# 254918 26-Aug-2013 raj

Introduce superpages support for ARMv6/v7.

Promoting base pages to superpages can increase TLB coverage and allow for
efficient use of page table entries. This development provides FreeBSD/ARM
with a superpages management mechanism roughly equivalent to what we have
for the i386 and amd64 architectures.

1. Add a mechanism for automatic promotion of 4KB page mappings to 1MB
section mappings (and demotion when no longer needed).

2. Managed and non-kernel mappings are now superpages-aware.

3. The functionality can be enabled by setting "vm.pmap.sp_enabled" tunable to
a non-zero value (either in loader.conf or by modifying "sp_enabled"
variable in pmap-v6.c file). By default, automatic promotion is currently
disabled.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
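
Declaring such a boot-time tunable typically looks like the following sketch
(standard sysctl macros; the exact declaration in pmap-v6.c may differ):

    #include <sys/param.h>
    #include <sys/kernel.h>
    #include <sys/sysctl.h>

    SYSCTL_DECL(_vm_pmap);

    /*
     * Readable at runtime, settable only at boot (CTLFLAG_RDTUN), e.g.
     * vm.pmap.sp_enabled=1 in loader.conf.  Defaults to off, per above.
     */
    static int sp_enabled = 0;
    SYSCTL_INT(_vm_pmap, OID_AUTO, sp_enabled, CTLFLAG_RDTUN,
        &sp_enabled, 0, "Enable automatic promotion to superpages");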


# 254913 26-Aug-2013 raj

Add missing TAILQ initializer (omitted in r250634).

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf


# 254901 26-Aug-2013 andrew

Revert r251370 as it contains a deadlock.


# 254667 22-Aug-2013 kib

Revert r254501. Instead, reuse the type stability of the struct pmap
which is part of struct vmspace, allocated from the UMA_ZONE_NOFREE
zone. Initialize the pmap lock in the vmspace zone init function, and
remove pmap lock initialization and destruction from pmap_pinit() and
pmap_release().

Suggested and reviewed by: alc (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation


# 254536 19-Aug-2013 raj

Do not use pv_kva on ARMv6/v7 and save some space on each vm_page. It's only
relevant for older ARM variants (with virtual cache).

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf


# 254535 19-Aug-2013 raj

Simplify and clean up pmap_clearbit()

There is no need to call vm_page_dirty() when clearing the "modified" flag,
as it is already set for that page in pmap_fault_fixup() or pmap_enter()
thanks to "modified" bit emulation.

Also, there is no need to check the PTE "referenced" or "writeable" flags.
If there is a request to clear a particular flag, we should just do it.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf


# 254533 19-Aug-2013 raj

Fix ARMv6/v7 mapping's wired status.

The last input argument to pmap_modify_pv() should be a mask of flags to
be set. In pmap_change_wiring(), however, the raw wired status was used,
which does not represent valid flags (and is of boolean type).

This commit fixes the issue so that the wired flag is passed to
pmap_modify_pv() properly.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf


# 254532 19-Aug-2013 raj

Clear all L2 PTE protection bits before their configuration.

Revise L2_S_PROT_MASK to include all of the protection bits. Notice that
clearing these bits does not always take away the corresponding permissions
(for example, permission is granted when the bit is cleared). The bits are
cleared but are to be set or left cleared accordingly in pmap_set_prot(),
pmap_enter_locked(), etc.

Clear L2_XN along with L2_S_PROT_MASK in pmap_set_prot() so that all
permissions related bits are cleared before actual configuration.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf


# 254531 19-Aug-2013 raj

Simplify pv_entry removal for ARMv6/v7:

- PGA_WRITEABLE indicates that there *might be* a writable mapping for the
particular page, so to avoid frequent sweeping of the pv_entries whenever
pmap_nuke_pv(), pmap_modify_pv(), etc. is called, it is sufficient to
clear that flag if there are no managed mappings for that page anymore
(notice that only pmap_enter is authorized to set this flag).
- Avoid redundant checking for PVF_WIRED flag when this flag cannot be set
anyway.
- Clear PGA_WRITEABLE only once for each vm_page instead of clearing it
multiple, redundant times in a loop when there are no writable mappings
to that page anymore.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf


# 254138 09-Aug-2013 attilio

The soft and hard busy mechanisms rely on the vm object lock to work.
Unify the two concepts into a real, minimal sxlock where the shared
acquisition represents the soft busy and the exclusive acquisition
represents the hard busy.
The old VPO_WANTED mechanism becomes the hard-path for this new lock
and it becomes per-page rather than per-object.
The vm_object lock becomes an interlock for this functionality:
it can be held in both read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumptions about the busy state unless it is also busying it.

Also:
- Add a new flag to directly share busy pages while vm_page_alloc
and vm_page_grab are being executed. This will be very helpful
once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag

The KPI is heavily changed; this is why the version is bumped.
It is very likely that some VM port users will need to change
their own code.

Sponsored by: EMC / Isilon storage division
Discussed with: alc
Reviewed by: jeff, kib
Tested by: gavin, bapt (older version)
Tested by: pho, scottl


# 254025 07-Aug-2013 jeff

Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation behavior.

- Normalize functions that allocate memory to use kmem_*
- Those that allocate address space are named kva_*
- Those that operate on maps are named kmap_*
- Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by: alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division


# 253052 09-Jul-2013 emaste

Remove an extraneous format string conversion specifier

Submitted by: wxs@


# 252695 04-Jul-2013 gber

Remove redundant clearing of the PGA_WRITEABLE flag in
pmap_remove_all()

This flag should already be cleared by pmap_nuke_pv()

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf


# 252694 04-Jul-2013 gber

Fix modified bit emulation for ARMv6/v7

When doing pmap_enter_locked(), enable write permission only when access
type indicates attempt to write. Otherwise, leave the page read only but
mark it writable in pv_flags.

This will result in:
1. Marking the page writable during pmap_enter(), but only when it is
ensured that it will be written right away, so that we will not get a
redundant permissions fault on the write attempt.
2. Keeping the page read only when it is permitted to be written but
there was no actual write attempt. Hence, we will get a permissions
fault on write access and mark the page writable in pmap_fault_fixup(),
which will indicate its modification status.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
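
The same policy as a standalone toy model (all names and types here are
illustrative, not the pmap-v6.c code):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define PTE_RW     0x1u   /* "write permitted" in the toy PTE */
    #define PVF_WRITE  0x2u   /* "allowed to write" in pv_flags */

    struct toy_mapping {
        uint32_t pte;
        uint32_t pv_flags;
        bool     dirty;
    };

    /* pmap_enter_locked()-style: grant RW only on an actual write access. */
    static void
    toy_enter(struct toy_mapping *m, bool may_write, bool is_write)
    {
        m->pte = 0;
        m->dirty = false;
        m->pv_flags = may_write ? PVF_WRITE : 0;
        if (may_write && is_write)
            m->pte |= PTE_RW;       /* written right away: map RW */
    }

    /* pmap_fault_fixup()-style: first write fault upgrades and dirties. */
    static void
    toy_write_fault(struct toy_mapping *m)
    {
        if (m->pv_flags & PVF_WRITE) {
            m->pte |= PTE_RW;
            m->dirty = true;        /* modification now recorded */
        }
    }

    int
    main(void)
    {
        struct toy_mapping m;

        toy_enter(&m, true, false);     /* writable region, read access */
        toy_write_fault(&m);            /* first write: upgrade + dirty */
        printf("rw=%u dirty=%d\n", m.pte & PTE_RW, (int)m.dirty);
        return (0);
    }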


# 251370 04-Jun-2013 gber

Implement pmap_copy() for ARMv6/v7.

Copy the given range of mappings from the source map to the
destination map, thereby reducing the number of VM faults on fork.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf


# 250931 23-May-2013 gber

Rework and organize pmap_enter_locked() function.

The pmap_enter_locked() implementation was very ambiguous and confusing.
Rearrange it so that each part of the mapping creation is separated.
Avoid walking through redundant conditions.
Extract vector_page-specific PTE setup from normal PTE setting.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf


# 250930 23-May-2013 gber

Stop using PVF_MOD, PVF_REF & PVF_EXEC flags in pv_entry, use PTE.

Using PVF_MOD, PVF_REF and PVF_EXEC is redundant, as we can get the proper
info from the PTE bits.
When the mapping is marked as executable and has been referenced, we assume
that it has been executed. Similarly, when the mapping is set to be writable
and is referenced, it must have been due to write access to it.
The PVF_MOD and PVF_REF flags are kept just for pmap_clearbit() usage,
to indicate which bit should be cleared.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf


# 250929 23-May-2013 gber

Improve, optimize and clean up ARMv6/v7 memory management related code.

Use pmap_find_pv() if needed instead of duplicating its code throughout
pmap-v6.

Avoid a possible NULL pointer dereference in pmap_enter_locked().
When trying to get m->md.pv_memattr, make sure that m != NULL,
in particular that vector_page is set to be NULL.

Do not set the PGA_REFERENCED flag in pmap_enter_pv().
On ARM, any new page reference will result in either entering the new
mapping by calling pmap_enter(), etc., or fixing up the existing mapping
in pmap_fault_fixup().
Therefore we set the PGA_REFERENCED flag in the aforementioned cases, and
setting it later in pmap_enter_pv() is just a waste of cycles.

Delete unused pm_pdir pointer from the pmap structure.

Rearrange brackets in the fault cause detection in trap.c.
Place the brackets correctly in order to make the course of the
conditions apparent at a glance.

Unify naming in pmap-v6.c and improve style.
Use naming common to the whole pmap and compatible with other pmaps;
improve style where possible:
pm -> pmap
pg -> m
opg -> om
*pt -> *ptep
*pte -> *ptep
*pde -> *pdep

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf


# 250928 23-May-2013 gber

Switch to the AP[2:1] access permissions model. Store the "referenced"
bit in the PTE.

Enable the Access Flag in CPU control. With AF enabled, each valid
mapping needs to have the referenced bit set in its PTE in order to be
cached in the TLB.

The AP[0] bit is used as the reference flag.
All access permissions are encoded by AP[2:1], wherein AP[1] is in fact
"user enable" and AP[2] (APX) is "write disable".

All mappings are always set to be valid. Reference emulation is performed
by setting/clearing the reference flag in the PTE.

md.pvh_attrs is no longer necessary; however, pv_flags is still being used
for now.

Marking a vm_page as "dirty" or "referenced" is performed on:
- page or flag fault servicing in pmap_fault_fixup(), based on the fault
type
- vm_fault servicing in pmap_enter(), according to the desired protections
and the faulting access type
Redundant page marking has been removed, as on ARM we know exactly when a
particular page is referenced or is going to be written.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
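
The bit assignments described above, as a sketch (the AP[1:0] and APX
positions below are the architectural ones for ARM small-page descriptors;
the macro names are illustrative):

    #include <stdint.h>

    #define L2_AP0  (1u << 4)   /* AP[0]: reference flag */
    #define L2_AP1  (1u << 5)   /* AP[1]: "user enable" */
    #define L2_APX  (1u << 9)   /* AP[2]/APX: "write disable" */

    /*
     * Reference emulation: every mapping stays valid, and a fault on an
     * unreferenced mapping is serviced by just setting AP[0], making the
     * entry TLB-cacheable from then on.
     */
    static inline uint32_t
    pte_mark_referenced(uint32_t pte)
    {
        return (pte | L2_AP0);
    }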


# 250884 21-May-2013 attilio

o Relax locking assertions for vm_page_find_least()
o Relax locking assertions for pmap_enter_object() and add them also
to architectures that currently don't have any
o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade
operation on the per-object rwlock
o Use all the mechanisms above to make vm_map_pmap_enter() work
most of the time with only read locks.

Sponsored by: EMC / Isilon storage division
Reviewed by: alc


# 250634 14-May-2013 gber

Port the new PV entry allocator from amd64/i386/mips to armv6/v7.

PV entries are now roughly half the size.
Instead of using a shared UMA zone for 28 byte pv entries
(two 8-byte tailq nodes, a 4 byte pointer, a 4 byte address and 4 byte
flags), we allocate a page at a time per process.
This provides 252 pv entries per process (actually, per pmap address space)
and eliminates one of the 8-byte tailq entries since we now can track
per-process pv entries implicitly.
The pointer to the pmap can be eliminated by doing address arithmetic on
the page header to find the metadata, yielding a single pointer shared by
all 252 entries. There is an 8-int bitmap for the freelist of those 252
entries.
Under serious low-memory conditions, allocation of another pv_chunk is
possible by freeing some pages in pmap_pv_reclaim().

Added pv_entry/pv_chunk related statistics to pmap.
pv_entry/pv_chunk statistics can be accessed via sysctl vm.pmap.

Ported the PTE freelist for KVA allocation and maintenance from i386.
Using an idea from Stephan Uphoff, use the empty PTEs that correspond
to the unused KVA in the pv memory block to thread a freelist through.
This allows us to free pages that used to be used for pv entry chunks
since we can now track holes in the kva memory block.

As ARM pmap.c and pmap-v6.c both use the same header while their pv_entry,
pmap and md_page structures are different, it was necessary to separate
code designed for ARMv6/7 from the code for other ARMs.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
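
A sketch of the chunk layout described above (the field layout follows the
amd64/i386 allocator this was ported from; the exact arm definitions live
in the pmap headers):

    #include <stdint.h>
    #include <sys/queue.h>

    #define _NPCM   8       /* 8-int bitmap for the freelist */
    #define _NPCPV  252     /* pv entries per one-page chunk */

    struct pmap;

    struct pv_entry {               /* now ~half the old 28 bytes */
        uint32_t                pv_va;      /* mapped VA */
        TAILQ_ENTRY(pv_entry)   pv_list;    /* single tailq node */
        uint32_t                pv_flags;   /* arm pv flags */
    };

    /*
     * One page per chunk: the owning pmap is shared by all 252 entries
     * and is found from any pv_entry by address arithmetic (truncating
     * the entry's address to the page boundary yields the header).
     */
    struct pv_chunk {
        struct pmap             *pc_pmap;
        TAILQ_ENTRY(pv_chunk)   pc_list;        /* chunks of this pmap */
        uint32_t                pc_map[_NPCM];  /* 1 bit == free slot */
        TAILQ_ENTRY(pv_chunk)   pc_lru;         /* pmap_pv_reclaim() order */
        struct pv_entry         pc_pventry[_NPCPV];
    };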


# 250299 06-May-2013 gber

Fix page reference emulation on ARMv6 and v7

Submitted by: Zbigniew Bodek
Obtained from: Semihalf


# 250297 06-May-2013 gber

Fix L2 PTE access permissions management.

Keep following access permissions:

APX   AP    Kernel   User
 1    01      R        N
 1    10      R        R
 0    01     R/W       N
 0    11     R/W      R/W

Avoid using an APX|AP combination that is reserved on ARMv6:
- In the case of unprivileged (user) access without permission to write,
the access permission bits were being set to APX|AP = 111, a value that
is reserved for ARMv6 (but valid for ARMv7).

Fix up faulting userland accesses properly:
- A wrong condition statement in pmap_fault_fixup() caused any genuine,
unprivileged access to be fixed up instead of just skipping it and
returning. Starting from now, we ensure a proper reaction to illicit
user accesses.

L2_S_PROT_R and L2_S_PROT_U names might be misleading as they do not
reflect real permission levels. It will be clarified in following
patches (switch to AP[2:1] permissions model).

Obtained from: Semihalf


# 248508 19-Mar-2013 kib

Implement the concept of unmapped VMIO buffers, i.e. buffers which
do not map the b_pages pages into buffer_map KVA. The use of
unmapped buffers eliminates the need to perform TLB shootdown for
mapping on buffer creation and reuse, greatly reducing the amount
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.

An unmapped buffer should be explicitly requested by the consumer with
the GB_UNMAPPED flag. For an unmapped buffer, no KVA reservation is
performed at all. The consumer might request an unmapped buffer which
does have a KVA reservation, to manually map it without recursing into
the buffer cache and blocking, with the GB_KVAALLOC flag.

When a mapped buffer is requested and an unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.

An unmapped buffer is translated into an unmapped bio in g_vfs_strategy().
An unmapped bio carries a pointer to the vm_page_t array, offset and
length instead of the data pointer. The provider which processes the bio
should explicitly specify a readiness to accept unmapped bios,
otherwise the g_down geom thread performs a transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.

The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it could be manually tuned by kern.bio_transient_maxcnt
tunable, in the units of the transient mappings. Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.

Unmapped support can be turned off by the vfs.unmapped_buf_allowed
tunable; disabling it makes buffer (or cluster) creation requests
ignore the GB_UNMAPPED and GB_KVAALLOC flags. Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.

In the rework, filesystem metadata is no longer subject to the
maxbufspace limit. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, are
accounted against maxbufspace, as before. Effectively, this means that
the maxbufspace limit is enforced on mapped and unmapped buffers
separately. The pre-patch bufspace limiting code did not work, because
buffer_map fragmentation does not allow the limit to be reached.

At Jeff Roberson's request, the getnewbuf() function was split into
smaller single-purpose functions.

Sponsored by: The FreeBSD Foundation
Discussed with: jeff (previous version)
Tested by: pho, scottl (previous version), jhb, bf
MFC after: 2 weeks
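
A consumer opts in roughly like this (kernel-only sketch; vp, blkno and
size stand for the caller's context):

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/buf.h>

    static struct buf *
    bread_unmapped(struct vnode *vp, daddr_t blkno, int size)
    {
        /*
         * GB_UNMAPPED: no KVA reservation at all -- b_data is not
         * usable and the consumer works through b_pages[] instead.
         */
        return (getblk(vp, blkno, size, 0, 0, GB_UNMAPPED));
    }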


# 248280 14-Mar-2013 kib

Add the pmap function pmap_copy_pages(), which copies content between
pages, taking arrays of vm_page_t for both source and destination.
Starting offsets and the total transfer size are specified.

The function implements an optimal algorithm for copying, using
platform-specific optimizations. For instance, on architectures
where the direct map is available, no transient mappings are created;
for i386, the per-cpu ephemeral page frame is used. The code was
typically borrowed from pmap_copy_page() for the same
architecture.

Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.

For sparc64, the existing code has known issues and a stub is added
instead, to allow the kernel to link.

Sponsored by: The FreeBSD Foundation
Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after: 2 weeks
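
The shape of the new function (prototype per this commit; a sketch, not the
full implementation):

    #include <vm/vm.h>
    #include <vm/pmap.h>

    /*
     * Copy xfersize bytes from the pages in ma[] starting at byte offset
     * a_offset into the pages in mb[] starting at b_offset, using the
     * cheapest platform strategy (direct map where available, the per-cpu
     * ephemeral frame on i386, ...).
     */
    void pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset,
            vm_page_t mb[], vm_offset_t b_offset, int xfersize);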


# 248084 09-Mar-2013 attilio

Switch the vm_object mutex to be a rwlock. This will enable
further optimizations in the future, where the vm_object lock will be
held in read mode most of the time the page-cache resident pool of
pages is accessed for reading purposes.

The change is mostly mechanical, but a few notes are reported:
* The KPI changes as follow:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing consumers
to require sys/mutex.h directly to cater to its inline functions
using VM_OBJECT_LOCK()) imposes that all vm/vm_pager.h
consumers must now also include sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer, because the name clash between the FreeBSD and solaris
versions must be avoided.
For this purpose, zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.

The KPI is heavily broken by this commit. Third-party ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho
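
The mechanical part of the conversion, per the table above (kernel-only
sketch):

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/rwlock.h>
    #include <vm/vm.h>
    #include <vm/vm_object.h>

    static void
    object_update(vm_object_t obj)
    {
        /* Old: VM_OBJECT_LOCK(obj); ... VM_OBJECT_UNLOCK(obj); */
        VM_OBJECT_WLOCK(obj);           /* write side after this commit */
        /* ... modify the object ... */
        VM_OBJECT_WUNLOCK(obj);
    }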


# 247360 26-Feb-2013 attilio

Merge from vmc-playground branch:
Replace the sub-optimal uma_zone_set_obj() primitive with the more modern
uma_zone_reserve_kva(). The new primitive reserves beforehand
the necessary KVA space to cater to the zone allocations and allocates
pages with ALLOC_NOOBJ. More specifically:
- uma_zone_reserve_kva() does not need an object to cater to the backend
allocator.
- uma_zone_reserve_kva() can cater to M_WAITOK requests, in order to
serve zones which need to do uma_prealloc() too.
- When possible, uma_zone_reserve_kva() directly uses the direct mapping
by uma_small_alloc() rather than relying on the KVA / offset
combination.

The removal of the object attribute allows 2 further changes:
1) _vm_object_allocate() becomes static within vm_object.c
2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by
direct calls to mtx_init() as there is no need to export it anymore
and the calls are no longer homogeneous: there are now small
differences between the arguments passed to mtx_init().

Sponsored by: EMC / Isilon storage division
Reviewed by: alc (who also offered almost all the comments)
Tested by: pho, jhb, davide


# 247046 20-Feb-2013 alc

Initialize vm_max_kernel_address on non-FDT platforms. (This should have
been included in r246926.)

The second parameter to pmap_bootstrap() is redundant. Eliminate it.

Reviewed by: andrew


# 246926 17-Feb-2013 alc

On arm, like sparc64, the end of the kernel map varies from one type of
machine to another. Therefore, VM_MAX_KERNEL_ADDRESS can't be a constant.
Instead, #define it to be a variable, vm_max_kernel_address, just like we
do on sparc64.

Reviewed by: kib
Tested by: ian


# 245146 08-Jan-2013 gonzo

Fix cache-related issues with the pmap for ARMv6/ARMv7:

- A missing PTE_SYNC in pmap_kremove caused memory corruption
in userland applications.
- Fix the lack of cache flushes when using special PTEs for zeroing or
copying pages. If there are dirty lines for the destination memory
and the page is later remapped as a non-cached region, the actual
content might be overwritten by these dirty lines when cache eviction
happens, either as a result of the cache eviction policy or because
of a wbinv_all call.
- Sync the icache for new userland mappings.

Tested by: gber


# 244574 21-Dec-2012 cognet

The VM_MEMATTR_ constants are enumerated, not a bitset. Compare accordingly.

Submitted by: Ian Lepore <freebsd@damnhippie.dyndns.org>


# 244414 18-Dec-2012 cognet

Properly implement pmap_[get|set]_memattr

Submitted by: Ian Lepore <freebsd@damnhippie.dyndns.org>


# 243109 15-Nov-2012 cognet

Don't forget to unlock the pmap lock on failure.


# 241063 30-Sep-2012 alc

Stop calling pmap_remove_write() from pmap_remove_all(). Doing so is not
only inefficient but also leads to recursive lock acquisition.

Tested by: ray


# 241055 29-Sep-2012 alc

Eliminate unused variables.


# 241054 29-Sep-2012 alc

Add support for mincore(). Specifically, this is an adaptation of the
pmap_mincore() implementation that was added to the original arm pmap
in r235717.


# 240983 27-Sep-2012 alc

Implementing pmap_kextract(va) as pmap_extract(kernel_pmap, va) is
problematic because some callers to pmap_kextract() expect its
implementation to be lock-less. In particular, uma_dbg_alloc() implicitly
requires this. Otherwise, lock-order reversals occur between pmap locks and
UMA zone locks. So, this change introduces a lock-less implementation of
pmap_kextract().

Disable recursion on the pvh global lock in the new armv6 pmap. While
recursion on this lock occurs in the old arm pmap, it thankfully doesn't
occur in the armv6 pmap.

Tested by: jmg


# 240913 25-Sep-2012 alc

Eliminate an unused declaration.


# 240803 22-Sep-2012 alc

Since UMA_ZONE_NOFREE is specified when l2zone and l2table_zone are created,
there is no need to release and reacquire the pmap and pvh global locks
around calls to uma_zfree(). Recursion into the pmap simply won't occur.

Eliminate the use of M_USE_RESERVE. It is deprecated and, in fact, counter-
productive, meaning that it actually makes the memory allocation request
more likely to fail.

Eliminate the macros pmap_{alloc,free}_l2_dtable(). They are of limited
utility, and pmap_free_l2_dtable() was inconsistently used.

Tidy up pmap_init(). In particular, change the initialization of the PV
zone so that it doesn't span the initialization of the l2 and l2table zones.

Tested by: jmg


# 240321 10-Sep-2012 alc

Replace all uses of the vm page queues lock by a r/w lock that is private
to this pmap.

Revise some comments.

The file vm/vm_param.h includes the file machine/vmparam.h, so there is no
need to directly include it.

Tested by: andrew


# 239369 18-Aug-2012 hrs

Fix build when DEBUG is defined.


# 239268 15-Aug-2012 gonzo

Merging projects/armv6, part 1

Cumulative patch of changes that are not vendor-specific:
- ARMv6 and ARMv7 architecture support
- ARM SMP support
- VFP/Neon support
- ARM Generic Interrupt Controller driver
- Simplification of startup code for all platforms