History log of /freebsd-10.1-release/lib/libmemstat/memstat_uma.c
Revision Date Author Comments
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 251894 18-Jun-2013 jeff

Refine UMA bucket allocation to reduce space consumption and improve
performance.

- Always free to the alloc bucket if there is space. This gives LIFO
allocation order to improve hot-cache performance. This also allows
for zones with a single bucket per-cpu rather than a pair if the entire
working set fits in one bucket.
- Enable per-cpu caches of buckets. To prevent recursive bucket
allocation one bucket zone still has per-cpu caches disabled.
- Pick the initial bucket size based on a table driven maximum size
per-bucket rather than the number of items per-page. This gives
more sane initial sizes.
- Only grow the bucket size when we face contention on the zone lock; this
causes bucket sizes to grow more slowly.
- Adjust the number of items per-bucket to account for the header space.
This packs the buckets more efficiently per-page while making them
not quite powers of two.
- Eliminate the per-zone free bucket list. Always return buckets back
to the bucket zone. This ensures that as zones grow into larger
bucket sizes they eventually discard the smaller sizes. It persists
fewer buckets in the system. The locking is slightly trickier.
- Only switch buckets in zalloc, not zfree; this eliminates pathological
cases where we ping-pong between two buckets.
- Ensure that the thread that fills a new bucket gets to allocate from
it to give a better upper bound on allocation time.
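
The LIFO behaviour described in the first item can be illustrated with a minimal sketch. This is not the UMA code: struct bucket, BUCKET_SIZE, bucket_free() and bucket_alloc() are hypothetical names, and the real allocator layers per-CPU caches and locking on top of this idea.

```c
#include <assert.h>
#include <stddef.h>

#define BUCKET_SIZE 4	/* hypothetical; the commit picks sizes from a table */

/* Hypothetical stand-in for an alloc bucket: a small stack of free items. */
struct bucket {
	void	*items[BUCKET_SIZE];
	int	 count;		/* number of valid entries in items[] */
};

/* Free path: push onto the bucket if there is room; return 0 if it is full. */
static int
bucket_free(struct bucket *b, void *item)
{
	assert(item != NULL);
	if (b->count == BUCKET_SIZE)
		return (0);	/* caller would fall back to a slower path */
	b->items[b->count++] = item;
	return (1);
}

/* Alloc path: pop the most recently freed item, or NULL if the bucket is empty. */
static void *
bucket_alloc(struct bucket *b)
{
	if (b->count == 0)
		return (NULL);
	return (b->items[--b->count]);
}
```

Because frees and allocations hit the same end of the array, the item freed last is handed out first, which is the property the commit relies on for hot-cache reuse.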

Sponsored by: EMC / Isilon Storage Division


# 242152 26-Oct-2012 mdf

Const-ify the zone name argument to uma_zcreate(9).

MFC after: 3 days


# 225330 02-Sep-2011 pluknet

Cosmetic cleanup: remove #define LIBMEMSTAT used to prevent a nested
include of opt_vmpage.h from vm/vm_page.h. opt_vmpage.h was retired
before 7.0 together with options PQ_NOOPT.

Approved by: re (kib)
MFC after: 3 days


# 224569 01-Aug-2011 pluknet

Get rid of MAXCPU knowledge used for internal needs only. Switch to
dynamic memory allocation to hold per-CPU memory types data (sized to
mp_maxid for UMA, and to mp_maxcpus for malloc to match the kernel).

That fixes libmemstat with arbitrary large MAXCPU values and therefore
eliminates MEMSTAT_ERROR_TOOMANYCPUS error type.
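
A minimal sketch of the switch from a compile-time array to run-time sizing follows. The names struct percpu_stats, struct memtype, memtype_init() and memtype_fini() are hypothetical and do not reflect the real libmemstat layout; the sketch also assumes maxid is the highest valid CPU id, so maxid + 1 slots are allocated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-CPU counters for one memory type. */
struct percpu_stats {
	uint64_t	allocs;
	uint64_t	frees;
};

/* Hypothetical memory type record; not the real libmemstat structure. */
struct memtype {
	int			 ncpus;		/* slots in percpu[] */
	struct percpu_stats	*percpu;	/* heap array, one slot per CPU id */
};

/* Allocate zeroed per-CPU storage for ids 0..maxid; return -1 on failure. */
static int
memtype_init(struct memtype *mt, int maxid)
{
	assert(maxid >= 0);
	mt->percpu = calloc((size_t)maxid + 1, sizeof(*mt->percpu));
	if (mt->percpu == NULL)
		return (-1);
	mt->ncpus = maxid + 1;
	return (0);
}

/* Release the per-CPU storage and reset the record. */
static void
memtype_fini(struct memtype *mt)
{
	free(mt->percpu);
	mt->percpu = NULL;
	mt->ncpus = 0;
}
```

With storage sized from the value reported at run time, no CPU count is too large for the library, which is why a "too many CPUs" error is no longer needed.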

Reviewed by: jhb
Approved by: re (kib)


# 222813 07-Jun-2011 attilio

Retire the cpumask_t type and replace it with cpuset_t usage.

This is intended to fix the bug where cpu mask objects are
capped to 32. MAXCPU, then, can now be arbitrarily bumped to whatever
value. Anyway, as long as several structures in the kernel are
statically allocated and sized as MAXCPU, it is suggested to keep it
as low as possible for the time being.

Technical notes on this commit itself:
- More functions to handle cpuset_t objects are introduced.
The most notable are cpusetobj_ffs() (which calculates a ffs(3)
for a cpuset_t object), cpusetobj_strprint() (which prepares a string
representing a cpuset_t object) and cpusetobj_strscan() (which
creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are targeted for removal soon.
With the move from cpumask_t to cpuset_t they are now inefficient
and not really useful. Anyway, for the time being, please note that
access to pcpu data is protected by sched_pin() in order to avoid
migrating the CPU while reading more than one (possible) word.
- Please note that size of cpuset_t objects may differ between kernel
and userland. While this is not directly related to the patch itself,
it is good to understand that concept and possibly use the patch
as a reference on how to deal with cpuset_t objects in userland, when
accessing kernel-land members.
- KTR_CPUMASK is changed and now is represented through a string, to be
set as the example reported in NOTES.

Please additionally note that no MAXCPU is bumped in this patch, but
private testing has been done up to MAXCPU=128 on a real 8x8x2(htt)
machine (amd64).

Please note that the FreeBSD version is not yet bumped because of
the upcoming pcpu changes. However, note that this patch is not
targeted for MFC.

People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested
several revisions of the patches and really helped in improving
stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed
patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted reviews of some specific parts of the
patch.
- marius, art, nwhitehorn and andreast reviewed MD-specific parts of
the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific
implementations of the patch.
- Other people have made contributions on other patches that have been
already committed and have been listed separately.

Companies that should be mentioned for having participated at several
degrees:
- Yahoo! for having offered the machines used for testing on big
count of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance,
which has been instrumental.
- Sandvine for having offered offices and infrastructure during
development.

(I really hope I didn't forget anyone, if it happened I apologize in
advance).


# 209215 15-Jun-2010 sbruno

Add a new column to the output of vmstat -z to indicate the number
of times the system was forced to sleep when requesting a new allocation.

Expand the debugger hook, db_show_uma, to display these results as well.

This has proven to be very useful in out of memory situations when
it is not known why systems have become sluggish or fail in odd ways.

Reviewed by: rwatson alc
Approved by: scottl (mentor) peter
Obtained from: Yahoo Inc.


# 155552 11-Feb-2006 rwatson

Update copyright for 2006.

MFC after: 3 days


# 155550 11-Feb-2006 rwatson

The uma_zone data structure defines the size of its uz_cpu[] array as 1,
but then sizes the containing data structure at run-time to make room
for per-cpu cache data. Modify libmemstat to separately allocate a
buffer to hold per-cpu cache data, sized based on the run-time mp_maxid
variable when using libkvm to access UMA data. This avoids reading
invalid cache data from beyond the end of the uma_zone data structure
on the stack, which can result in invalid statistics and/or reads from
invalid kernel addresses.
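
The problem and the fix can be sketched as follows. This is an illustration only: struct zone, struct cache, fake_read() and read_cpu_caches() are hypothetical stand-ins, with fake_read() modelling a kvm(3)-style read by a plain memcpy(), and the buffer size assumes maxid is the highest valid CPU id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-CPU cache counters. */
struct cache {
	unsigned long	allocs;
	unsigned long	frees;
};

/* Declared with one slot, but the owner allocates room for more. */
struct zone {
	unsigned long	z_allocs;
	struct cache	z_cpu[1];
};

/* Hypothetical stand-in for a kvm(3)-style read: copy len bytes from src. */
static int
fake_read(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
	return (0);
}

/*
 * Copy the per-CPU caches that follow the zone at "addr" into a
 * separately allocated buffer of (maxid + 1) entries, instead of
 * reading past the end of a struct zone held on the stack.
 * Returns NULL on failure; the caller frees the result.
 */
static struct cache *
read_cpu_caches(const struct zone *addr, int maxid)
{
	const char *src;
	struct cache *buf;
	size_t len;

	assert(maxid >= 0);
	len = ((size_t)maxid + 1) * sizeof(struct cache);
	buf = malloc(len);
	if (buf == NULL)
		return (NULL);
	src = (const char *)addr + offsetof(struct zone, z_cpu);
	if (fake_read(src, buf, len) != 0) {
		free(buf);
		return (NULL);
	}
	return (buf);
}
```

The key point is that sizeof(struct zone) covers only the first cache slot, so the per-CPU region has to be sized and read on its own.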

Foot target practice by: ps
MFC after: 3 days


# 155549 11-Feb-2006 rwatson

When reporting an error reading from UMA per-cpu cache pointers using KVM,
return a KVM error rather than an out of memory error, so that the caller
reports the KVM error state. This replaces a misleading error message
with a more accurate although equally confusing one.

MFC after: 3 days


# 155547 11-Feb-2006 rwatson

Read all_cpus variable out of kmem, and validate CPUs against the all_cpus
cpu mask before looking at the cache entries for the CPU. For systems
with sparse CPU id arrays, this skips otherwise uninitialized cache
structures.
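
A minimal sketch of the mask check, assuming for illustration that the CPU mask fits in a 64-bit integer: struct cache and sum_present_allocs() are hypothetical names, and only the idea of testing each CPU id against all_cpus comes from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU cache counters. */
struct cache {
	uint64_t	allocs;
	uint64_t	frees;
};

/*
 * Sum allocation counts over the CPUs whose bit is set in all_cpus.
 * Slots for absent CPU ids may hold garbage and are skipped.
 */
static uint64_t
sum_present_allocs(const struct cache *caches, int ncpus, uint64_t all_cpus)
{
	uint64_t total = 0;
	int cpu;

	assert(ncpus >= 0 && ncpus <= 64);
	for (cpu = 0; cpu < ncpus; cpu++) {
		if ((all_cpus & (UINT64_C(1) << cpu)) == 0)
			continue;	/* CPU id not present: ignore its slot */
		total += caches[cpu].allocs;
	}
	return (total);
}
```

For example, with a mask of 0x5 only the slots for CPU ids 0 and 2 contribute to the total.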

MFC after: 3 days


# 155542 11-Feb-2006 rwatson

Correct a typo in the extraction of zone information from UMA using kmem:
bytes = allocated - freed, not bytes = allocated = freed.
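
The effect of the typo can be shown with a small sketch. Both function names are hypothetical; the buggy variant reproduces the chained assignment quoted above, which compiles cleanly but yields the freed count instead of the difference.

```c
#include <assert.h>
#include <stdint.h>

/* Intended computation: bytes currently in use. */
static uint64_t
bytes_in_use(uint64_t allocated, uint64_t freed)
{
	assert(allocated >= freed);
	return (allocated - freed);
}

/* The typo: '=' instead of '-' turns the expression into a chained assignment. */
static uint64_t
bytes_in_use_buggy(uint64_t allocated, uint64_t freed)
{
	uint64_t bytes;

	bytes = allocated = freed;	/* overwrites allocated, then copies freed */
	return (bytes);
}
```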

MFC after: 3 days


# 154416 15-Jan-2006 rwatson

Remove unnecessary and undesirable 'static' from function-local keg
list, which could cause problems for multi-threaded applications
using libmemstat to monitor UMA in more than one thread
simultaneously.

MFC after: 3 days


# 148693 04-Aug-2005 rwatson

Define LIBMEMSTAT so that vm_page.h won't perform a nested include of
opt_vmpage.h.

Remove definition of _KERNEL, it is no longer required in order to
include uma_int.h, as the sensitive parts of uma_int.h (a number of
inlines depending on kernel-only constants) are now protected by
_KERNEL.


# 148627 01-Aug-2005 rwatson

Add memstat_kvm_uma(), an implementation of a libmemstat(3) query routine
that knows how to extract UMA(9) allocator statistics from a core dump or
live memory image using kvm(3). The caller is expected to provide the
necessary kvm_t handle, which is then used by libmemstat(3).

With these changes, it is trivially straightforward to re-introduce
vmstat -z support on core dumps, which was lost when UMA was introduced.

In the short term, this requires including vm/ include files that are not
intended for extra-kernel use, requiring in turn some ugliness.


# 148619 01-Aug-2005 rwatson

Correct two libmemstat(3) bugs:

- Move memory_type_list flushing logic from memstat_mtl_free() to
_memstat_mtl_empty(), a libmemstat-internal function that can
be called from other parts of the library. Invoke
_memstat_mtl_empty() from memstat_mtl_free(), which also frees
the containing list structure.

Invoke _memstat_mtl_empty() instead of memstat_mtl_free() in
various error cases in memstat_malloc.c and memstat_uma.c, which
previously resulted in the list being freed prematurely.

- Reverse the order of updating the mt_kegfree and mt_free fields
of the memory_type in memstat_uma.c, otherwise keg free items
won't be counted properly for non-secondary zones.

MFC after: 3 days


# 148381 25-Jul-2005 rwatson

If a retrieved UMA zone is a secondary zone, don't report keg free items,
as they actually belong to the primary zone, and may otherwise be
reported more than once.

MFC after: 1 day


# 148357 23-Jul-2005 rwatson

Introduce more formal error handling for libmemstat(3):

- Define a set of libmemstat(3) error constants, which are used by all
libmemstat(3) methods except for memstat_mtl_alloc(), which allocates
a memory type list and may return ENOMEM via errno.

- Define a per-memory_type_list current error value, which is set when a
call associated with a memory list fails. This requires wrapping a
structure around the queue(9) list head data structure, but this change
is not visible to libmemstat(3) consumers due to using access methods.

- Add a new accessor method, memstat_mtl_geterror() to retrieve the error
number.

- Consistently set the error number in a number of failure modes where
previously some combination of setting errno and printf'ing error
descriptions was used. libmemstat(3) will now no longer print to stdio
under any circumstances. Returns of NULL/-1 for errors remain the
same.

This avoids use of stdio, misuse of error numbers, and should make it
easier to program a libmemstat(3) consumer able to print useful error
messages. Currently, no error-to-string function is provided, as I'm
unsure how to address internationalization concerns.
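
The per-list error pattern can be sketched as below. Every name here (enum sketch_error, struct sketch_list, sketch_query(), sketch_list_geterror()) is hypothetical and chosen to avoid implying the real libmemstat(3) definitions; the sketch only shows an error code stored in a wrapper around the list head and read back through an accessor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error constants, standing in for the library's own set. */
enum sketch_error {
	SKETCH_ERROR_NONE = 0,
	SKETCH_ERROR_NOMEMORY,
	SKETCH_ERROR_VERSION
};

/* Wrapper structure: the list head plus the current error value. */
struct sketch_list {
	void	*head;		/* stands in for the queue(9) list head */
	int	 error;		/* set when a call on this list fails */
};

/* Accessor: consumers never touch the structure directly. */
static int
sketch_list_geterror(const struct sketch_list *list)
{
	assert(list != NULL);
	return (list->error);
}

/*
 * Hypothetical query: fails on an unexpected version by recording an
 * error on the list and returning -1, without printing anything.
 */
static int
sketch_query(struct sketch_list *list, int version)
{
	if (version != 1) {
		list->error = SKETCH_ERROR_VERSION;
		return (-1);
	}
	return (0);
}
```

A consumer checks the -1 return and then asks the list for the reason, which keeps message formatting and any localization in the application rather than in the library.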

MFC after: 1 day


# 148354 23-Jul-2005 rwatson

Prefix two non-static libmemstat(3) internal functions with '_' symbols, to
try and discourage use outside the library.

Remove duplicate declaration of memstat_mtl_free() from memstat_internal.h,
as it's not internal, and the memstat.h definition suffices.


# 148170 20-Jul-2005 rwatson

UMA supports "secondary" zones, in which a second zone can be layered
on top of a primary zone, sharing the same allocation "keg". When
reporting statistics for zones, do not report the free items in the
keg as part of the free items in the zone, or those free items will
be reported more than once: for the primary zone, and then any
secondary zones off the primary zone. Separately record and maintain
a kegfree statistic, and export via memstat_get_kegfree(), which is
available for use if needed. Since items free'd back to the keg are
not fully initialized, and hence may not actually be available (since
secondary zone ctor-time initialization can fail), this makes some
amount of sense.
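
A minimal sketch of the double-counting problem, under the assumption that a consumer wants one system-wide total: struct zone_stats and total_free_items() are hypothetical names. Keg free items are kept in a separate field and attributed only to the primary zone, so a keg shared with a secondary zone is counted once.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-zone statistics record. */
struct zone_stats {
	uint64_t	free;		/* free items cached in the zone itself */
	uint64_t	kegfree;	/* free items in the backing keg */
	int		secondary;	/* nonzero if layered on another zone's keg */
};

/* Total free items, counting each shared keg once via its primary zone. */
static uint64_t
total_free_items(const struct zone_stats *zones, int nzones)
{
	uint64_t total = 0;
	int i;

	assert(nzones >= 0);
	for (i = 0; i < nzones; i++) {
		total += zones[i].free;
		if (!zones[i].secondary)
			total += zones[i].kegfree;
	}
	return (total);
}
```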

This change corrects a bug made visible in the libmemstat(3)
modifications to netstat: mbufs freed back to the keg from the
packet zone would be counted twice, resulting in negative values
being printed in the mbuf free count.

Some further refinement of reporting relating to secondary zones may
still be required.

Reported by: ssouhlal
MFC after: 3 days


# 148071 15-Jul-2005 rwatson

Teach libmemstat(3) about UMA(9) failure statistics.

Requested by: victor cruceru <victor dot cruceru at gmail dot com>
MFC after: 1 week


# 148038 15-Jul-2005 rwatson

Re-spell wronge less wrongly as wrong.

Submitted by: jkoshy
MFC after: 1 week


# 148007 14-Jul-2005 rwatson

Properly combine per-CPU UMA cache allocation and free counts with the
global counters maintained in the zone.
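
As an illustration of what combining the counters means, the sketch below adds per-CPU cache counts to the zone-wide counts. struct counts and combine_counts() are hypothetical names, not the library's own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of counters, used for both zone and per-CPU values. */
struct counts {
	uint64_t	allocs;
	uint64_t	frees;
};

/* Zone totals are the global counters plus every per-CPU cache's counters. */
static struct counts
combine_counts(struct counts zone, const struct counts *percpu, int ncpus)
{
	int i;

	assert(ncpus >= 0);
	for (i = 0; i < ncpus; i++) {
		zone.allocs += percpu[i].allocs;
		zone.frees += percpu[i].frees;
	}
	return (zone);
}
```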

MFC after: 1 week


# 147997 14-Jul-2005 rwatson

Add libmemstat(3), a library for use by debugging and monitoring
applications in tracking kernel memory statistics. It provides an
abstracted interface to uma(9) and malloc(9) statistics, wrapped
around the recently added binary stream sysctls for the allocators.

Using this interface, it is easy to build monitoring tools, query
specific memory types for usage information, etc. Facilities are
provided for binding caller-provided data to memory types,
incremental updates of memory types, and queries that span multiple
allocators.

Support for additional allocators is (relatively) easy to add.

The API for libmemstat(3) will probably change some over time as
consumers are written, and requirements evolve. It is written to
avoid encoding ABIs for data structure layout into consuming
applications for this reason.

MFC after: 1 week