#
95ee2897 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
5440e701 |
|
28-Jul-2023 |
Dmitry Chagin <dchagin@FreeBSD.org> |
i386: Don't use static DPCPU and VNET defines in i386 modules As of c84617e8, a fix similar to 4802a2cb and b6ea4c5a should be applied to i386 too. Reviewed by: Differential Revision: https://reviews.freebsd.org/D41195
|
#
b5d43972 |
|
21-Mar-2023 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: decouple freevnodes from vnode batching In principle one cpu can keep vholding vnodes, while another vdrops them. In this case the local count may keep growing without bound. Roll it up after a threshold instead. While here move it out of dpcpu into struct pcpu. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D39195
|
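The roll-up idea in the commit above can be sketched in userland C. This is a single-threaded illustration only, not the vfs code: `NCPU_SKETCH`, `ROLLUP_THRESHOLD`, `local_delta`, `global_free`, and both function names are hypothetical, and a real kernel version would need per-CPU access and atomics.

```c
#include <assert.h>
#include <stdlib.h>

#define NCPU_SKETCH      4   /* hypothetical CPU count */
#define ROLLUP_THRESHOLD 8   /* hypothetical roll-up threshold */

static long global_free;                 /* shared total */
static long local_delta[NCPU_SKETCH];    /* per-CPU pending adjustments */

/*
 * Apply an adjustment on one CPU. Once the magnitude of the local
 * delta reaches the threshold, fold it into the shared total so the
 * local value stays bounded.
 */
static void
freecount_adjust(int cpu, long n)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	local_delta[cpu] += n;
	if (labs(local_delta[cpu]) >= ROLLUP_THRESHOLD) {
		global_free += local_delta[cpu];
		local_delta[cpu] = 0;
	}
}

/* Exact read: shared total plus every pending local delta. */
static long
freecount_read(void)
{
	long sum = global_free;

	for (int i = 0; i < NCPU_SKETCH; i++)
		sum += local_delta[i];
	return (sum);
}
```

With one CPU only incrementing and another only decrementing, each local delta stays below the threshold in magnitude while the exact total remains recoverable.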
#
f1f98706 |
|
18-Apr-2021 |
Warner Losh <imp@FreeBSD.org> |
Minor style cleanup We prefer 'while (0)' to 'while(0)' according to grep and style(9)'s space after keyword rule. Remove a few stragglers of the latter. Many of these usages were inconsistent within the file. MFC After: 3 days Sponsored by: Netflix
|
#
dfff3776 |
|
12-Apr-2021 |
Mark Johnston <markj@FreeBSD.org> |
Rename struct device to struct _device types.h defines device_t as a typedef of struct device *. struct device is defined in subr_bus.c and almost all of the kernel uses device_t. The LinuxKPI also defines a struct device, so type confusion can occur. This causes bugs and ambiguity for debugging tools. Rename the FreeBSD struct device to struct _device. Reviewed by: gbe (man pages) Reviewed by: rpokala, imp, jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29676
|
#
bee115bc |
|
12-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Dedup zpcpu assertions into one macro and guard the rest with #ifndef Sponsored by: The FreeBSD Foundation
|
#
3acb6572 |
|
12-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Store offset into zpcpu allocations in the per-cpu area. This shortens zpcpu_get and allows more optimizations. Reviewed by: jeff Differential Revision: https://reviews.freebsd.org/D23570
|
#
d1e55387 |
|
10-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Tidy up zpcpu_replace*
- only compute the target address once
- remove spurious type casting, zpcpu_get already returns the correct type
While here add missing newlines to other routines.
|
#
c77649d8 |
|
07-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Add zpcpu_{set,add,sub}_protected. The _protected suffix follows counter(9).
|
#
98d97cde |
|
17-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
Convert zpcpu_* inlines to macros and add zpcpu_replace. This allows them to do basic type casting internally, effectively relieving consumers from having to cast on their own.
|
#
4cace859 |
|
16-Sep-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: convert struct mount counters to per-cpu There are 3 counters modified all the time in this structure - one for keeping the structure alive, one for preventing unmount and one for tracking active writers. Exact values of these counters are very rarely needed, which makes them a prime candidate for conversion to a per-cpu scheme, resulting in much better performance. Sample benchmark performing fstatfs (modifying 2 out of 3 counters) on a 104-way 2 socket Skylake system: before: 852393 ops/s after: 76682077 ops/s Reviewed by: kib, jeff Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21637
|
#
a2a0f906 |
|
29-Aug-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Centralize __pcpu definitions. Many extern struct pcpu <something>__pcpu declarations were copied/pasted in sources. The issue is that the definition is MD, but it cannot be provided by machine/pcpu.h due to actual struct pcpu defined in sys/pcpu.h later than the inclusion of machine/pcpu.h. This forced the copying when other code needed direct access to __pcpu. There is no way around it, due to machine/pcpu.h supplying part of struct pcpu fields. To work around the problem, add a new machine/pcpu_aux.h header, which should fill any needed MD definitions after struct pcpu definition is completed. This allows removing the copies of __pcpu spread around the source. Also on x86 it makes it possible to remove workarounds like OFFSETOF_CURTHREAD or clang-specific warning suppressions. Reported and tested by: lwhsu, bcran Reviewed by: imp, markj (previous version) Discussed with: jhb Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21418
|
#
018ff686 |
|
12-Aug-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Move scheduler state into the per-cpu area where it can be allocated on the correct NUMA domain. Reviewed by: markj, gallatin Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19315
|
#
e2edff41 |
|
25-Jun-2019 |
Leandro Lupori <luporl@FreeBSD.org> |
[PowerPC64] Don't mark module data as static Fixes panic when loading ipfw.ko and if_epair.ko built with modern compiler. Similar to arm64 and riscv, when using a modern compiler (!gcc4.2), code generated tries to access data in the wrong location, causing kernel panic (data storage interrupt trap) when loading if_epair and ipfw. Issue was reproduced with kernel/module compiled using gcc8 and clang8. It affects both ELFv1 and ELFv2 ABI environments. PR: 232387 Submitted by: alfredo.junior_eldorado.org.br Reported by: Mark Millard Reviewed by: jhibbits Differential Revision: https://reviews.freebsd.org/D20461
|
#
e2e050c8 |
|
19-May-2019 |
Conrad Meyer <cem@FreeBSD.org> |
Extract eventfilter declarations to sys/_eventfilter.h This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h" in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header pollution substantially. EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c files into appropriate headers (e.g., sys/proc.h, powernv/opal.h). As a side effect of reduced header pollution, many .c files and headers no longer contain needed definitions. The remainder of the patch addresses adding appropriate includes to fix those files. LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by sys/mutex.h since r326106 (but silently protected by header pollution prior to this change). No functional change (intended). Of course, any out of tree modules that relied on header pollution for sys/eventhandler.h, sys/lock.h, or sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.
|
#
86c59375 |
|
12-Sep-2018 |
Ruslan Bukin <br@FreeBSD.org> |
Don't mark module data as static on RISC-V. Similar to arm64, the riscv compiler uses PC-relative loads/stores, and with static data the compiler does not emit relocations. As a result, the kernel module linker has nothing to fix and data is accessed from the wrong location. Approved by: re (gjb) Sponsored by: DARPA, AFRL
|
#
b6ea4c5a |
|
30-Jul-2018 |
Andrew Turner <andrew@FreeBSD.org> |
As with DPCPU_DEFINE_STATIC make VNET_DEFINE_STATIC non-static on arm64 in modules. It also fails in the same way, we are unable to relocate static variables as the compiler uses PC-relative loads with nothing for the kernel linker to relocate. Sponsored by: DARPA, AFRL
|
#
4802a2cb |
|
16-Jul-2018 |
Andrew Turner <andrew@FreeBSD.org> |
Don't use the static keyword with DPCPU defines in arm64 modules. On arm64 the compiler will create PC-relative loads and stores for static data. This means it doesn't emit a relocation. Unfortunately the in-kernel linker expects there to be one for DPCPU defines so it can modify its value so the code will use the correct DPCPU region. To work around the lack of a relocation with static data, remove the keyword when building modules on arm64. The kernel is unaffected as it doesn't rely on modifying these relocations to find the data. PR: 225684 Reported by: Johannes Lundberg <johalun0@gmail.com> Reported by: Jose Luis Duran <jlduran@gmail.com> Reported by: Greg V <greg@unrelenting.technology> Reviewed by: bz Sponsored by: ABT Systems Ltd Differential Revision: https://reviews.freebsd.org/D16145
|
#
53dec71d |
|
06-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Expand x86 struct pcpus to UMA_PCPU_ALLOC_SIZE AKA PAGE_SIZE. This restores counters(9) operation. Revert r336024. Improve assert of pcpu size on x86. Reviewed by: mmacy Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D16163
|
#
fb0a2811 |
|
06-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert to recommit with the proper message.
|
#
16147166 |
|
06-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Save a call to pmap_remove() if entry cannot have any pages mapped. Due to the way rtld creates mappings for the shared objects, each dso causes unmap of at least three guard map entries. For instance, in the buildworld load, this change reduces the amount of pmap_remove() calls by 1/5. Profiled by: alc Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16148
|
#
ab3059a8 |
|
05-Jul-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Back pcpu zone with domain correct pages
- Change pcpu zone consumers to use a stride size of PAGE_SIZE. (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)
- Allocate page from the correct domain for a given cpu.
- Don't initialize pc_domain to non-zero value if NUMA is not defined. There are some misconceptions surrounding this field. It is the _VM_ NUMA domain and should only ever correspond to valid domain values as understood by the VM.
The former slab size of sizeof(struct pcpu) was somewhat arbitrary. The new value is PAGE_SIZE because that's the smallest granularity which the VM can allocate a slab for a given domain. If you have fewer than PAGE_SIZE/8 counters on your system there will be some memory wasted, but this is obviously something where you want the cache line to be coming from the correct domain. Reviewed by: jeff Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15933
|
#
2bf95012 |
|
05-Jul-2018 |
Andrew Turner <andrew@FreeBSD.org> |
Create a new macro for static DPCPU data. On arm64 (and possibly other architectures) we are unable to use static DPCPU data in kernel modules. This is because the compiler will generate PC-relative accesses, however the runtime-linker expects to be able to relocate these. In preparation to fix this create two macros depending on if the data is global or static. Reviewed by: bz, emaste, markj Sponsored by: ABT Systems Ltd Differential Revision: https://reviews.freebsd.org/D16140
|
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
83c9dea1 |
|
17-Apr-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter in place. To do per-cpu stats, convert all fields that previously were maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot, before we have set up UMA and we can do counter_u64_alloc(), provide an early counter mechanism:
  o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
  o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter, so that at early stages of boot, before counters are allocated we already point to a counter that can be safely written to.
  o For sparc64 that required a whole dummy pcpu[MAXCPU] array.
Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit, to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.
This is based on benno@'s 4-year old patch: https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html Reviewed by: kib, gallatin, marius, lidl Differential Revision: https://reviews.freebsd.org/D10156
|
#
fbbd9655 |
|
28-Feb-2017 |
Warner Losh <imp@FreeBSD.org> |
Renumber copyright clause 4 Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96
|
#
650b9693 |
|
09-Aug-2016 |
Jean-Sébastien Pédron <dumbbell@FreeBSD.org> |
sys/pcpu.h: Revert change introduced in r303890 `device_t` is not defined outside the kernel but this header is used by eg. libkvm or vmstat(8). Thus, r303890 broke the build. So let's restore `struct device` here until a longer term solution is found. Reported by: Michael Butler <imb@protected-networks.net>, Jenkins MFC after: 3 days MFC with: r303890
|
#
bd937497 |
|
09-Aug-2016 |
Jean-Sébastien Pédron <dumbbell@FreeBSD.org> |
Consistently use `device_t` Several files use the internal name of `struct device` instead of `device_t` which is part of the public API. This patch changes all `struct device *` to `device_t`. The remaining occurrences of `struct device` are those referring to the Linux or OpenBSD version of the structure, or the code is not built on FreeBSD and it's unclear what to do. Submitted by: Matthew Macy <mmacy@nextbsd.org> (previous version) Approved by: emaste, jhibbits, sbruno MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D7447
|
#
7c6f639b |
|
04-May-2016 |
Enji Cooper <ngie@FreeBSD.org> |
Revert r299096 The change broke buildworld when building lib/libkvm. This change likely needs to be run through a ports -exp run as a sanity check, as it might break downstream consumers. Pointyhat to: adrian Reported by: kargl (confirmed on $work workstation) Sponsored by: EMC / Isilon Storage Division
|
#
1b34262b |
|
04-May-2016 |
Adrian Chadd <adrian@FreeBSD.org> |
s/struct device */device_t/g Submitted by: kmacy
|
#
c25fabea |
|
27-Aug-2015 |
Mark Johnston <markj@FreeBSD.org> |
Remove weighted page handling from vm_page_advise(). This was added in r51337 as part of the implementation of madvise(MADV_DONTNEED). Its objective was to ensure that the page daemon would eventually reclaim other unreferenced pages (i.e., unreferenced pages not touched by madvise()) from the active queue. Now that the pagedaemon performs steady scanning of the active page queue, this weighted handling is unnecessary. Instead, always "cache" clean pages by moving them to the head of the inactive page queue. This simplifies the implementation of vm_page_advise() and eliminates the fragmentation that resulted from the distribution of pages among multiple queues. Suggested by: alc Reviewed by: alc Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3401
|
#
9188e408 |
|
10-Feb-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Add zpcpu_get_cpu() that converts base pointer of UMA_ZPCPU_ZONE to a pointer private to a given cpuid. Sponsored by: Nginx, Inc.
|
#
17dece86 |
|
08-Apr-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Merge from projects/counters: Pad struct pcpu so that its size is a divisor of PAGE_SIZE. This is done to reduce memory waste in UMA_PCPU_ZONE zones. Sponsored by: Nginx, Inc.
|
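A minimal sketch of the padding rule from the commit above, assuming a power-of-two page size so that every divisor is itself a power of two. `PAGE_SIZE_SKETCH`, `padded_size()`, and `struct pcpu_sketch` are hypothetical names for illustration and do not reflect the real layout of struct pcpu.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096u   /* hypothetical page size */

/*
 * Return the smallest power of two that is >= size, which is also a
 * divisor of page_size. Returns 0 when size is 0 or exceeds page_size.
 */
static size_t
padded_size(size_t size, size_t page_size)
{
	size_t d;

	/* The rule below only holds for power-of-two page sizes. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	if (size == 0 || size > page_size)
		return (0);
	for (d = 1; d < size; d <<= 1)
		;
	return (d);
}

/* A structure with 300 bytes of members, padded by hand to 512. */
struct pcpu_sketch {
	char fields[300];        /* stand-in for the real members */
	char pad[512 - 300];     /* explicit padding */
};

/* An integral number of structures now fits in one page. */
_Static_assert(PAGE_SIZE_SKETCH % sizeof(struct pcpu_sketch) == 0,
    "padded size must divide the page size");
```

Because the padded size divides the page size, a page-sized slab holds a whole number of per-CPU structures with no tail left over.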
#
6a612df1 |
|
17-Sep-2012 |
Attilio Rao <attilio@FreeBSD.org> |
Remove namespace pollution in _rmlock.h by defining rm_queue structure directly in _rmlock.h and then including it (and its dependencies) in pcpu.h. This leads to few _*.h headers to be included in pcpu.h but this is not considered a big deal. Really pc_rm_queue should be implemented as a dynamic member with DPCPU interface, but we really want to keep the read acquisition as fast as possible, so even the further pc_dynamic indirection should be avoided, and the pollution is dealt like this. Discussed with: jhb MFC after: 1 week
|
#
6338c579 |
|
19-Jul-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Remove explicit MAXCPU usage from sys/pcpu.h avoiding a namespace pollution. That is a step further in the direction of building correct policies for userland and modules on how to deal with the number of maxcpus at runtime. Reported by: jhb Reviewed and tested by: pluknet Approved by: re (kib)
|
#
edf26ab8 |
|
19-Jul-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Remove pc_name member of struct pcpu. pc_name is only included when the KTR option is enabled, and it introduces a subtle KBI breakage that totally breaks vmstat when world and kernel are not in sync. Besides, it is not used anywhere. In collaboration with: pluknet Reviewed by: jhb Approved by: re (kib)
|
#
2a292868 |
|
09-Jul-2011 |
Marius Strobl <marius@FreeBSD.org> |
Fix the definition for PCPU_NAME_LEN, which is intended to fit ("CPU %d", cpuid) where cpuid <= MAXCPU.
1. sizeof(__XSTRING(MAXCPU) + 1) is a typo: typeof(__XSTRING(...) + 1) is 'char *', so sizeof() will return the size of the pointer, not the size of the string contents. The proper expression should be 'sizeof(__XSTRING(MAXCPU)) + 1'.
2. One should not add one, but subtract it: sizeof() accounts for the trailing '\0' and we have two sizeof's, so the size of one '\0' should be subtracted -- this will give the maximal string buffer length for CPU with its number, no less, no more.
Submitted by: rea
|
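The two points in the commit above can be checked in userland C. This is a sketch under stated assumptions: `XSTRING` stands in for `__XSTRING`, `MAXCPU_SKETCH` is a hypothetical value, and the macro and function names are illustrative rather than copied from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define XSTRING_(x)    #x
#define XSTRING(x)     XSTRING_(x)   /* stand-in for __XSTRING */
#define MAXCPU_SKETCH  256           /* hypothetical CPU limit */

/*
 * Buggy form: "256" + 1 is pointer arithmetic on a decayed string
 * literal, so sizeof() yields sizeof(char *), not the string size.
 */
#define NAME_LEN_BUGGY \
	(sizeof("CPU ") + sizeof(XSTRING(MAXCPU_SKETCH) + 1))

/*
 * Fixed form: each sizeof() counts one trailing '\0', so subtract
 * one of them to get the exact buffer size for "CPU <n>".
 */
#define NAME_LEN_FIXED \
	(sizeof("CPU ") + sizeof(XSTRING(MAXCPU_SKETCH)) - 1)

/* Format a CPU name; returns the number of characters written. */
static int
format_cpu_name(char *buf, size_t len, int cpuid)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "CPU %d", cpuid));
}
```

With `MAXCPU_SKETCH` set to 256 the fixed length is 5 + 4 - 1 = 8 bytes, which is exactly the seven characters of "CPU 256" plus the terminator.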
#
a2f4e284 |
|
04-Jul-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Completely remove now unused pc_other_cpus, pc_cpumask. Tested by: pluknet
|
#
d098f930 |
|
31-May-2011 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
On multi-core, multi-threaded PPC systems, it is important that the threads be brought up in the order they are enumerated in the device tree (in particular, that thread 0 on each core be brought up first). The SLIST through which we loop to start the CPUs has all of its entries added with SLIST_INSERT_HEAD(), which means it is in reverse order of enumeration and so AP startup would always fail in such situations (causing a machine check or RTAS failure). Fix this by changing the SLIST into an STAILQ, and inserting new CPUs at the end. Reviewed by: jhb
|
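The ordering problem described in the commit above is easy to reproduce with the `sys/queue.h` macros. The sketch below assumes a hypothetical `struct cpu_sketch` carrying both link types so the same records can be placed on either list; it is not the PowerPC startup code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical per-CPU record with one link per list type. */
struct cpu_sketch {
	int id;
	SLIST_ENTRY(cpu_sketch) s_link;
	STAILQ_ENTRY(cpu_sketch) t_link;
};

SLIST_HEAD(cpu_slist, cpu_sketch);
STAILQ_HEAD(cpu_tailq, cpu_sketch);

/* Insert at the head, then record the iteration order in out[]. */
static void
order_slist(struct cpu_sketch *cpus, int n, int *out)
{
	struct cpu_slist head = SLIST_HEAD_INITIALIZER(head);
	struct cpu_sketch *c;
	int i;

	assert(cpus != NULL && out != NULL);
	for (i = 0; i < n; i++)
		SLIST_INSERT_HEAD(&head, &cpus[i], s_link);
	i = 0;
	SLIST_FOREACH(c, &head, s_link)
		out[i++] = c->id;
}

/* Insert at the tail, then record the iteration order in out[]. */
static void
order_stailq(struct cpu_sketch *cpus, int n, int *out)
{
	struct cpu_tailq head = STAILQ_HEAD_INITIALIZER(head);
	struct cpu_sketch *c;
	int i;

	assert(cpus != NULL && out != NULL);
	for (i = 0; i < n; i++)
		STAILQ_INSERT_TAIL(&head, &cpus[i], t_link);
	i = 0;
	STAILQ_FOREACH(c, &head, t_link)
		out[i++] = c->id;
}
```

Head insertion yields the records in reverse enumeration order, while tail insertion preserves it, which is the property the AP startup loop needs.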
#
d5880f9c |
|
27-May-2011 |
Attilio Rao <attilio@FreeBSD.org> |
In the near future cpuset_t objects in struct pcpu will be axed out, but as long as this does not happen, we need to fix interfaces to userland in order to not break run-time accesses to the structure. Reviewed by: kib Tested by: pluknet
|
#
71a19bdc |
|
05-May-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t, on the other hand, is implemented as an array of longs, and is easily extendible by definition. The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN
while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes:
- This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, need to be accessed avoiding migration, because the size of cpuset_t should be considered unknown
- size of cpuset_t objects is different from kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now
The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno
|
#
3e288e62 |
|
22-Nov-2010 |
Dimitry Andric <dim@FreeBSD.org> |
After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted:
------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines
Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined.
------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines
Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.
------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines
Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
|
#
c3adda9f |
|
14-Nov-2010 |
Dimitry Andric <dim@FreeBSD.org> |
Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined.
|
#
47d46d92 |
|
14-Nov-2010 |
Dimitry Andric <dim@FreeBSD.org> |
Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
|
#
5f67450d |
|
14-Nov-2010 |
Dimitry Andric <dim@FreeBSD.org> |
Similar to sys/net/vnet.h, define the linker set name for sys/sys/pcpu.h as a macro, and use it instead of literal strings.
|
#
4403994d |
|
11-Nov-2010 |
Dimitry Andric <dim@FreeBSD.org> |
Use the same treatment as in linker_set.h for the __start and __stop symbols of the set_vnet and set_pcpu sections, so those symbols will always be emitted in kernel modules, if they use vnet.h or pcpu.h. Also, for pcpu.h, make the __(start|stop)_set_pcpu declarations, and associated macros invisible to userland, to prevent it picking up these symbols. Reviewed by: kib
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
d9481554 |
|
15-Sep-2010 |
Andriy Gapon <avg@FreeBSD.org> |
sys/pcpu.h: remove a workaround for a fixed ld bug The workaround was incorrectly documented as having something to do with set_pcpu section's progbits, but in fact it was for incorrect placement of __start_set_pcpu because of the bug in ld. The bug was fixed in r210245, see commit message for details. A side-effect of the workaround was that a zero-size set_pcpu section was produced for modules, source code of which included pcpu.h but didn't actually define any dynamic per-cpu variables. This commit should remove the side-effect. The same workaround is present in sys/net/vnet.h, has an analogous side-effect and can be removed as well. An UPDATING entry that warns about a need for recent ld is following. MFC after: 1 month
|
#
a3870a18 |
|
27-Jul-2010 |
John Baldwin <jhb@FreeBSD.org> |
Very rough first cut at NUMA support for the physical page allocator. For now it uses a very dumb first-touch allocation policy. This will change in the future.
- Each architecture indicates the maximum number of supported memory domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
- Each cpu now has a PCPU_GET(domain) member to indicate the memory domain a CPU belongs to. Domain values are dense and numbered from 0.
- When a platform supports multiple domains, the default freelist (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain. The MD code is required to populate an array of mem_affinity structures. Each entry in the array defines a range of memory (start and end) and a domain for the range. Multiple entries may be present for a single domain. The list is terminated by an entry where all fields are zero. This array of structures is used to split up phys_avail[] regions that fall in VM_FREELIST_DEFAULT into per-domain freelists.
- Each memory domain has a separate lookup-array of freelists that is used when fulfilling a physical memory allocation. Right now the per-domain freelists are listed in a round-robin order for each domain. In the future a table such as the ACPI SLIT table may be used to order the per-domain lookup lists based on the penalty for each memory domain relative to a specific domain. The lookup lists may be examined via a new vm.phys.lookup_lists sysctl.
- The first-touch policy is implemented by using PCPU_GET(domain) to pick a lookup list when allocating memory.
Reviewed by: alc
|
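The zero-terminated affinity table described in the commit above can be walked as in the following sketch. The structure layout, the exclusive `end` bound, and the names `mem_affinity_sketch` and `domain_for_addr()` are assumptions made for illustration; the commit message only states that each entry has a start, an end, and a domain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: [start, end) belongs to the given domain. */
struct mem_affinity_sketch {
	uint64_t start;
	uint64_t end;      /* assumed exclusive in this sketch */
	int domain;
};

/*
 * Walk the table until the all-zero terminator and return the domain
 * that covers the physical address, or -1 if no entry matches.
 */
static int
domain_for_addr(const struct mem_affinity_sketch *tbl, uint64_t pa)
{
	assert(tbl != NULL);
	for (; !(tbl->start == 0 && tbl->end == 0 && tbl->domain == 0);
	    tbl++) {
		if (pa >= tbl->start && pa < tbl->end)
			return (tbl->domain);
	}
	return (-1);
}
```

Several entries may name the same domain, so the lookup returns on the first range that contains the address rather than assuming one range per domain.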
#
96f2f82a |
|
16-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Unbreak DPCPU_SUM() by dereferencing the pointer returned by DPCPU_ID_PTR(). MFC after: 3 days
|
#
7d1d80fd |
|
13-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
- The sum variable used in DPCPU_SUM needs to be of the same type as the DPCPU variable, rather than a pointer to the type.
- Zero # bytes equivalent to sizeof(object), not sizeof(ptr_to_object).
- Remove an unnecessary __typeof.
Sponsored by: FreeBSD Foundation Submitted by: jmallet MFC after: 3 days
|
#
aac762c2 |
|
13-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Macro to simplify zeroing DPCPU variables. Sponsored by: FreeBSD Foundation MFC after: 3 days
|
#
07de7d5f |
|
13-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
- Rename DPCPU_SUM to DPCPU_VARSUM to better reflect the fact it operates on member variables of a DPCPU struct.
- Add DPCPU_SUM which sums a DPCPU variable.
Sponsored by: FreeBSD Foundation MFC after: 3 days
|
#
cb0bd51d |
|
18-Jun-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
- Rename the internal for loop iterator to "_i" to avoid potential shadowing of external variables named "i". The "_" prefix is reserved for infrastructure type code and is therefore not expected to be used by normal code likely to call DPCPU_SUM(). [1]
- Change DPCPU_SUM to return the sum rather than calculate and assign it internally. Usage is now: "sum = DPCPU_SUM(dpcpu_var, member_to_sum);" [2]
- Fix some style nits. [3]
Sponsored by: FreeBSD Foundation Suggested by: bde [3], mdf [1], kib [1,2], pjd [1,3] Reviewed by: kib MFC after: 1 week (instead of r209119)
|
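A userland sketch of a summing macro in the style the commit above describes: it returns the sum as a value and uses an `_i` iterator. It is not the kernel's DPCPU_SUM. The per-CPU storage is simulated with a plain array, all names are hypothetical, and the macro relies on the GNU statement-expression and `__typeof` extensions accepted by GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU_SKETCH 4   /* hypothetical CPU count */

/* Hypothetical per-CPU statistics, one copy per simulated CPU. */
struct stats_sketch {
	uint64_t pkts;
	uint64_t bytes;
};
static struct stats_sketch percpu_stats[NCPU_SKETCH];

/*
 * Sum one member across all per-CPU copies and yield the result.
 * The accumulator has the member's own type, and the iterator is
 * named "_i" so a caller's "i" is never shadowed.
 */
#define PERCPU_SUM_SKETCH(arr, member) ({				\
	__typeof((arr)[0].member) _sum = 0;				\
	for (int _i = 0; _i < NCPU_SKETCH; _i++)			\
		_sum += (arr)[_i].member;				\
	_sum;								\
})

/* The caller's own "i" can be used next to the macro safely. */
static uint64_t
scaled_total_pkts(int i)
{
	assert(i >= 0);
	return (PERCPU_SUM_SKETCH(percpu_stats, pkts) * (uint64_t)i);
}
```

Returning the value lets the macro appear directly on the right-hand side of an assignment or inside a larger expression, matching the usage pattern quoted in the commit message.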
#
5ad333cf |
|
12-Jun-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Add a utility macro to simplify calculating an aggregate sum from a DPCPU counter variable. Sponsored by: FreeBSD Foundation Reviewed by: jhb, rpaulo, rwatson (previous version of patch) MFC after: 1 week
|
#
567e51e1 |
|
24-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Roughly half of a typical pmap_mincore() implementation is machine- independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore(). Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page. Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information. Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock. Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed(). Reviewed by: kib (an earlier version)
|
#
e80b9043 |
|
30-Mar-2010 |
John Baldwin <jhb@FreeBSD.org> |
Use CACHE_LINE_SIZE alignment for 'struct pcpu' rather than hardcoding 128. Reviewed by: jeff
|
#
495072de |
|
30-Mar-2010 |
John Baldwin <jhb@FreeBSD.org> |
Various and sundry style, whitespace, and comment fixes. Submitted by: bde (mostly)
|
#
33962e6d |
|
10-Mar-2010 |
John Baldwin <jhb@FreeBSD.org> |
Typo.
|
#
28a2b3dd |
|
12-Aug-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFC r196118: Put minimum alignment on the dpcpu and vnet section so that ld when adding the __start_ symbol knows the expected section alignment and can place the __start_ symbol correctly. These sections will not support symbols with super-cache line alignment requirements. For full details, see posting to freebsd-current, 2009-08-10, Message-ID: <20090810133111.C93661@maildrop.int.zabbadoz.net>. Debugging and testing patches by: Kamigishi Rei (spambox haruhiism.net), np, lstewart, jhb, kib, rwatson Tested by: Kamigishi Rei, lstewart Reviewed by: kib Approved by: re
|
#
1b501e53 |
|
12-Aug-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Put minimum alignment on the dpcpu and vnet section so that ld when adding the __start_ symbol knows the expected section alignment and can place the __start_ symbol correctly. These sections will not support symbols with super-cache line alignment requirements. For full details, see posting to freebsd-current, 2009-08-10, Message-ID: <20090810133111.C93661@maildrop.int.zabbadoz.net>. Debugging and testing patches by: Kamigishi Rei (spambox haruhiism.net), np, lstewart, jhb, kib, rwatson Tested by: Kamigishi Rei, lstewart Reviewed by: kib Approved by: re
|
#
eddfbb76 |
|
14-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
#
50c202c5 |
|
23-Jun-2009 |
Jeff Roberson <jeff@FreeBSD.org> |
Implement a facility for dynamic per-cpu variables. - Modules and kernel code alike may use DPCPU_DEFINE(), DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined PCPU_*. Requires only one extra instruction more than PCPU_* and is virtually the same as __thread for builtin and much faster for shared objects. DPCPU variables can be initialized when defined. - Modules are supported by relocating the module's per-cpu linker set over space reserved in the kernel. Modules may fail to load if there is insufficient space available. - Track space available for modules with a one-off extent allocator. Free may block for memory to allocate space for an extent. Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas
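The space tracking for modules might look roughly like the first-fit extent list below. This is a userland sketch under assumptions, not the committed allocator: `extent_sim` and the `extent_sim_*()` functions are hypothetical, and offsets are plain numbers rather than kernel addresses. It does show why freeing can need memory: returning a range that does not merge with its predecessor requires a new list node.

```c
#include <stddef.h>
#include <stdlib.h>

/* A free range [start, start + len) inside the reserved module space. */
struct extent_sim {
	size_t			 start;
	size_t			 len;
	struct extent_sim	*next;
};

static struct extent_sim *free_list_sim;	/* kept sorted by start */

/* Seed the allocator with the space reserved for modules. */
static int
extent_sim_init(size_t reserved)
{
	struct extent_sim *e;

	e = malloc(sizeof(*e));
	if (e == NULL)
		return (-1);
	e->start = 0;
	e->len = reserved;
	e->next = NULL;
	free_list_sim = e;
	return (0);
}

/* First fit; -1 means no extent is large enough and the load would fail. */
static int
extent_sim_alloc(size_t size, size_t *offp)
{
	struct extent_sim **ep, *e;

	for (ep = &free_list_sim; (e = *ep) != NULL; ep = &e->next) {
		if (e->len < size)
			continue;
		*offp = e->start;
		e->start += size;
		e->len -= size;
		if (e->len == 0) {
			*ep = e->next;
			free(e);
		}
		return (0);
	}
	return (-1);
}

/* Return a range, merging with neighbours where they touch. */
static int
extent_sim_free(size_t off, size_t size)
{
	struct extent_sim **ep, *e, *prev, *n;

	prev = NULL;
	for (ep = &free_list_sim; (e = *ep) != NULL && e->start < off;
	    ep = &e->next)
		prev = e;
	if (prev != NULL && prev->start + prev->len == off) {
		prev->len += size;
		n = prev;
	} else {
		n = malloc(sizeof(*n));	/* the allocation free may wait on */
		if (n == NULL)
			return (-1);
		n->start = off;
		n->len = size;
		n->next = e;
		*ep = n;
	}
	if (n->next != NULL && n->start + n->len == n->next->start) {
		e = n->next;
		n->len += e->len;
		n->next = e->next;
		free(e);
	}
	return (0);
}
```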
|
#
869c29a7 |
|
05-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Trim old remnants of per-CPU KTR buffers. Submitted by: Eygene Ryabinkin
|
#
d4b5cae4 |
|
01-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Reimplement the netisr framework in order to support parallel netisr threads: - Support up to one netisr thread per CPU, each processing its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifiers (m2flow). Falls back on NETISR_POLICY_SOURCE if no flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256. - All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits.
Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver-generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz
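The three ordering policies can be pictured as a placement decision. The sketch below is a userland illustration only: every type and function name is hypothetical, and reducing an identifier modulo the number of workstreams is an assumption made for the example, not necessarily how the kernel maps work to workstreams.

```c
#include <stddef.h>
#include <stdint.h>

enum netisr_policy_sim {
	POLICY_SOURCE_SIM,
	POLICY_FLOW_SIM,
	POLICY_CPU_SIM
};

struct pkt_sim {
	uintptr_t	source;		/* interface or socket identity */
	int		has_flowid;
	uint32_t	flowid;
};

struct handler_sim {
	enum netisr_policy_sim	 policy;
	/* Optional: derive a flow ID; returns nonzero on success (m2flow). */
	int		(*m2flow)(struct pkt_sim *);
	/* Optional: let the protocol choose a CPU itself (m2cpuid). */
	unsigned	(*m2cpuid)(const struct pkt_sim *);
};

/* Pick a workstream index in [0, nws); nws must be nonzero. */
static unsigned
select_workstream_sim(const struct handler_sim *h, struct pkt_sim *p,
    unsigned nws)
{
	switch (h->policy) {
	case POLICY_CPU_SIM:
		if (h->m2cpuid != NULL)
			return (h->m2cpuid(p) % nws);
		break;
	case POLICY_FLOW_SIM:
		if (!p->has_flowid && h->m2flow != NULL)
			p->has_flowid = h->m2flow(p);
		if (p->has_flowid)
			return (p->flowid % nws);
		break;		/* no flow ID: fall back to source ordering */
	case POLICY_SOURCE_SIM:
		break;
	}
	return ((unsigned)(p->source % nws));
}
```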
|
#
0d2cf837 |
|
25-Jan-2009 |
Jeff Roberson <jeff@FreeBSD.org> |
- Use __XSTRING where I want the define to be expanded. This resulted in sizeof("MAXCPU") being used to calculate a string length rather than something more reasonable such as sizeof("32"). This shouldn't have caused any ill effect until we run on machines with 1000000 or more cpus.
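The difference between the two stringification forms can be reproduced in a few lines. The macros below are local stand-ins written for this example (`STRING_SIM`, `XSTRING_SIM`), and the value 32 for `MAXCPU` is just the figure quoted in the message.

```c
#include <stddef.h>

#define	MAXCPU		32		/* example value from the message */

#define	STRING_SIM(x)	#x		/* stringify the argument as written */
#define	XSTRING_SIM(x)	STRING_SIM(x)	/* expand the argument, then stringify */

/* Sized from the macro's name: sizeof("MAXCPU"), six letters plus NUL. */
static const size_t len_unexpanded = sizeof(STRING_SIM(MAXCPU));

/* Sized from the macro's value: sizeof("32"), two digits plus NUL. */
static const size_t len_expanded = sizeof(XSTRING_SIM(MAXCPU));
```

The unexpanded form leaves room for six digits, which is why the mistake would only have mattered for CPU counts of seven digits or more.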
|
#
8f51ad55 |
|
17-Jan-2009 |
Jeff Roberson <jeff@FreeBSD.org> |
- Implement generic macros for producing KTR records that are compatible with src/tools/sched/schedgraph.py. This allows developers to quickly create a graphical view of ktr data for any resource in the system. - Add sched_tdname() and the pcpu field 'name' for quickly and uniformly identifying records associated with a thread or cpu. - Reimplement the KTR_SCHED traces using the new generic facility. Obtained from: attilio Discussed with: jhb Sponsored by: Nokia
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
70d12a18 |
|
19-Aug-2008 |
John Baldwin <jhb@FreeBSD.org> |
Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports already define _KERNEL to get to this and I'm about to add hooks to libkvm to access per-CPU data. MFC after: 1 week
|
#
b75e2d0b |
|
06-Mar-2008 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Move the PCPU_MD_FIELDS last in struct pcpu. While this header is private to the kernel, some ports define _KERNEL and include this header. While arguably this is wrong, it's also reality. By having the MD fields last, architectures that have CPU-specific variations of PCPU_MD_FIELDS will at least have the MI fields at a constant offset. Of course, having all MI fields first helps kernel debugging as well, so this is not a change without some benefits to us. This change does not result in an ABI breakage, because this header is not part of the ABI. Recompilation of lsof is required though :-)
|
#
1ad19157 |
|
14-Dec-2007 |
David E. O'Brien <obrien@FreeBSD.org> |
Add comment to pc_cp_time.
|
#
7628402b |
|
28-Nov-2007 |
Peter Wemm <peter@FreeBSD.org> |
Move the shared cp_time array (counts %sys, %user, %idle etc) to the per-cpu area. cp_time[] goes away and a new function creates a merged cp_time-like array for things like linprocfs, sysctl etc. The atomic ops for updating cp_time[] in statclock go away, and the scope of the thread lock is reduced. sysctl kern.cp_time returns a backwards compatible cp_time[] array. A new kern.cp_times sysctl returns the individual per-cpu stats. I have pending changes to make top and vmstat optionally show per-cpu stats. I'm very aware that there are something like 5 or 6 other versions "out there" for doing this - but none were handy when I needed them. I did merge my changes with John Baldwin's, and ended up replacing a few chunks of my stuff with his, and stealing some other code. Reviewed by: jhb Partly obtained from: jhb
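A rough userland picture of the per-CPU counters and the merged view follows. `pc_cp_time` is the field named elsewhere in this history; everything else (`pcpu_cpt_sim`, `cp_time_tick_sim()`, `read_cp_time_sim()`, the CPU count and the five-state layout) is assumed for illustration.

```c
#include <stddef.h>

#define	CPUSTATES_SIM	5	/* assumed: user, nice, sys, intr, idle */
#define	NCPU_CPT_SIM	4	/* hypothetical CPU count */

struct pcpu_cpt_sim {
	long	pc_cp_time[CPUSTATES_SIM];
};

static struct pcpu_cpt_sim pcpu_cpt_sim[NCPU_CPT_SIM];

/* Clock tick on one CPU: only that CPU's slot is touched, so no atomics. */
static void
cp_time_tick_sim(int cpu, int state)
{
	pcpu_cpt_sim[cpu].pc_cp_time[state]++;
}

/* Build the backwards-compatible merged array by summing over all CPUs. */
static void
read_cp_time_sim(long *cp_time)
{
	int cpu, state;

	for (state = 0; state < CPUSTATES_SIM; state++) {
		cp_time[state] = 0;
		for (cpu = 0; cpu < NCPU_CPT_SIM; cpu++)
			cp_time[state] += pcpu_cpt_sim[cpu].pc_cp_time[state];
	}
}
```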
|
#
f53d15fe |
|
08-Nov-2007 |
Stephan Uphoff <ups@FreeBSD.org> |
Initial checkin for rmlock (read mostly lock) a multi reader single writer lock optimized for almost exclusive reader access. (see also rmlock.9) TODO: Convert to per cpu variables linkerset as soon as it is available. Optimize UP (single processor) case.
|
#
42ce445f |
|
06-Jun-2007 |
David Xu <davidxu@FreeBSD.org> |
Backout experimental adaptive-spin umtx code.
|
#
c640357f |
|
10-Mar-2007 |
Alan Cox <alc@FreeBSD.org> |
Push down the implementation of PCPU_LAZY_INC() into the machine-dependent header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it atomic with respect to interrupts. Reviewed by: bde, jhb
|
#
4e32b7b3 |
|
19-Dec-2006 |
David Xu <davidxu@FreeBSD.org> |
Add an lwpid field to the per-cpu structure; the lwpid represents the currently running thread's id on each cpu. This allows us to add in-kernel adaptive spin for user-level mutexes. While spinning in user space is possible, without correct thread running state exported from the kernel it can hardly be implemented efficiently without wasting cpu cycles; however, exporting thread running state is unlikely to be implemented soon, as interfaces have to be designed and stabilized first. This implementation is transparent to user space and can be disabled dynamically. With this change, a mutex ping-pong program's performance is improved massively on SMP machines. Performance of the mysql super-smack select benchmark is increased by about 7% on an Intel dual dual-core2 Xeon machine, which indicates that on systems which have a bunch of cpus and low system-call overhead (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance. Added sysctls: kern.threads.umtx_dflt_spins: if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used as the spin cycle count. kern.threads.umtx_max_spins: the sysctl sets the upper limit of the spin cycle count. Tested on: Athlon64 X2 3800+, Dual Xeon 5130
|
#
e0b65125 |
|
29-Nov-2006 |
John Birrell <jb@FreeBSD.org> |
Turn console printf buffering into a kernel option and only on by default for sun4v where it is absolutely required. This change moves the buffer from struct pcpu to the stack to avoid using the critical section which created a LOR in a couple of cases due to interaction with the tty code and kqueue. The LOR can't be fixed with the critical section and the pcpu buffer can't be used without the critical section. Putting the buffer on the stack was my initial solution, but it was pointed out that the stress on the stack might cause problems depending on the call path. We don't have a way of creating tests for those possible cases, so it's best to leave this as an option for the time being. In time we may get enough data to enable this option more generally.
|
#
3d068827 |
|
31-Oct-2006 |
John Birrell <jb@FreeBSD.org> |
Add a cnputs() function to write a string to the console with a lock to prevent interspersed strings written from different CPUs at the same time. To avoid putting a buffer on the stack or having to malloc one, space is incorporated in the per-cpu structure. The buffer size is 128 bytes; chosen because it's the next power of 2 size up from 80 characters. String writes to the console are buffered up to the end of the line or until the buffer fills. Then the buffer is flushed to all console devices. Existing low level console output via cnputc() is unaffected by this change. ithread calls to log() are also unaffected to avoid blocking those threads. A minor change to the behaviour in a panic situation is that console output will still be buffered, but won't be written to a tty as before. This should prevent interspersed panic output as a number of CPUs panic before we end up single threaded running ddb. Reviewed by: scottl, jhb MFC after: 2 weeks
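The buffering rule (flush at end of line or when the 128-byte buffer fills) can be sketched as follows. This is a single-threaded userland illustration: the names are hypothetical, the console lock is omitted, and `console_write_sim()` only counts bytes where the real code would write to the console devices.

```c
#include <stddef.h>

#define	CNBUF_SIZE_SIM	128	/* next power of 2 up from 80 columns */

static char	cnbuf_sim[CNBUF_SIZE_SIM];
static size_t	cnbuf_len_sim;
static size_t	cn_flushes_sim;		/* number of flushes so far */
static size_t	cn_bytes_out_sim;	/* bytes handed to the "console" */

/* Stand-in for writing a buffer to every console device. */
static void
console_write_sim(const char *buf, size_t len)
{
	(void)buf;
	cn_bytes_out_sim += len;
}

static void
cnbuf_flush_sim(void)
{
	if (cnbuf_len_sim == 0)
		return;
	console_write_sim(cnbuf_sim, cnbuf_len_sim);
	cnbuf_len_sim = 0;
	cn_flushes_sim++;
}

/* Buffer a string; flush on newline or when the buffer is full. */
static void
cnputs_sim(const char *s)
{
	for (; *s != '\0'; s++) {
		cnbuf_sim[cnbuf_len_sim++] = *s;
		if (*s == '\n' || cnbuf_len_sim == sizeof(cnbuf_sim))
			cnbuf_flush_sim();
	}
}
```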
|
#
5b1a8eb3 |
|
07-Feb-2006 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Modify the way we account for CPU time spent (step 1) Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers. For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.) On slower machines, the avoided multiplications to normalize timestamps at every context switch come out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
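The "accumulate raw ticks, scale only on inspection" approach can be sketched in userland as below. The names and the tick rate are made up for the example, and the result is expressed in microseconds purely for illustration.

```c
#include <stdint.h>

struct thread_ticks_sim {
	uint64_t	runtime;	/* accumulated raw ticks */
};

static uint64_t switchtime_sim;		/* tick count at the last switch */

/* Context switch: one subtraction and one addition, no scaling. */
static void
switch_account_sim(struct thread_ticks_sim *old, uint64_t now)
{
	old->runtime += now - switchtime_sim;
	switchtime_sim = now;
}

/*
 * Scale ticks to microseconds only when someone asks. The whole-second
 * and remainder parts are scaled separately to limit overflow; rate is
 * in ticks per second and must be nonzero.
 */
static uint64_t
ticks_to_usec_sim(uint64_t ticks, uint64_t rate)
{
	return ((ticks / rate) * 1000000 + (ticks % rate) * 1000000 / rate);
}
```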
|
#
0e56f395 |
|
26-Apr-2005 |
John Baldwin <jhb@FreeBSD.org> |
Drop the CURPROC, curkse, and curksegrp aliases as they aren't used anywhere.
|
#
7849b49a |
|
26-Apr-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Add 'curcpu', a shortcut to the current CPU ID, similar to curthread, curproc, et al. Useful for indexing into per-CPU data structures. MFC after: 2 weeks
|
#
60727d8b |
|
06-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for license, minor formatting changes
|
#
b2ae7ed7 |
|
27-Mar-2004 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Change the type of the various CPU masks to cpumask_t. Note that as long as there are still explicit uses of int, whether in types or in function names (such as atomic_set_int() in sched_ule.c), we can not change cpumask_t to be anything other than u_int. See also the commit log for sys/sys/types.h, revision 1.84.
|
#
29f5b9a8 |
|
08-Mar-2004 |
Nate Lawson <njl@FreeBSD.org> |
Hook CPUs up to newbus. CPUs will ultimately be a bus driver so that multiple CPU-specific drivers can attach. This is a work in progress so children aren't supported yet. Help from: jhb
|
#
de8e370e |
|
20-Nov-2003 |
Peter Wemm <peter@FreeBSD.org> |
Allow the MD backend to provide an alternative to #define curthread PCPU_GET(curthread) since it's so heavily used in the kernel and ripe for compile-speed optimization on some platforms.
|
#
696058c3 |
|
09-Dec-2002 |
Julian Elischer <julian@FreeBSD.org> |
Unbreak the KSE code. Keep track of zombie threads using the Per-CPU storage during the context switch. Rearrange thread cleanups to avoid problems with Giant. Clean threads when freed or when recycled. Approved by: re (jhb)
|
#
004998bc |
|
20-Aug-2002 |
John Baldwin <jhb@FreeBSD.org> |
Whitespace and style fixes. Submitted by: bde
|
#
80f5c8bf |
|
04-Apr-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
Embed a struct vmmeter in the per-cpu structure and add a macro, PCPU_LAZY_INC() which increments elements in it for cases where we can afford the occasional inaccuracy. Use of per-cpu stats counters avoids significant cache stalls in various critical paths that would otherwise severely limit our cpu scalability. Adjust all sysctls accessing cnt.* elements to now use a procedure which aggregates the requested field for all cpus and for the global vmmeter. The global vmmeter is retained, since some stats counters, like v_free_min, cannot be made per-cpu. Also, this allows us to convert counters from the global vmmeter to the per-cpu vmmeter in a piecemeal fashion, so have at it!
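One way to picture the aggregation procedure is a helper that takes a field offset and sums that field over the global structure and every per-CPU copy. The sketch is userland only; the structure is cut down to two fields, and `PCPU_LAZY_INC_SIM()`, `vcnt_sim()` and the other `_sim` names are hypothetical.

```c
#include <stddef.h>

#define	NCPU_VM_SIM	2	/* hypothetical CPU count */

struct vmmeter_sim {
	unsigned	v_swtch;	/* counted per CPU in this sketch */
	unsigned	v_free_min;	/* example of a global-only field */
};

static struct vmmeter_sim cnt_sim;			/* global vmmeter */
static struct vmmeter_sim pcpu_cnt_sim[NCPU_VM_SIM];	/* per-CPU copies */

/* Plain, non-atomic increment of one CPU's private counter. */
#define	PCPU_LAZY_INC_SIM(cpu, field)	(pcpu_cnt_sim[(cpu)].field++)

/* Sum the unsigned field at byte offset off over global and per-CPU data. */
static unsigned
vcnt_sim(size_t off)
{
	unsigned total;
	int cpu;

	total = *(unsigned *)(void *)((char *)&cnt_sim + off);
	for (cpu = 0; cpu < NCPU_VM_SIM; cpu++)
		total += *(unsigned *)(void *)
		    ((char *)&pcpu_cnt_sim[cpu] + off);
	return (total);
}
```

Because the global value is always part of the sum, a counter can live in either place, which is what allows a field-by-field migration.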
|
#
8d0747c9 |
|
20-Mar-2002 |
John Baldwin <jhb@FreeBSD.org> |
Document that MD pcpu fields are defined in PCPU_MD_FIELDS in machine/pcpu.h. Requested by: dillon
|
#
181df8c9 |
|
26-Feb-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
revert last commit temporarily due to whining on the lists.
|
#
f96ad4c2 |
|
26-Feb-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
STAGE-1 of 3 commit - allow (but do not require) interrupts to remain enabled in critical sections and streamline critical_enter() and critical_exit(). This commit allows an architecture to leave interrupts enabled inside critical sections if it so wishes. Architectures that do not wish to do this are not affected by this change. This commit implements the feature for the I386 architecture and provides a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For now you can turn the sysctl on and off at any time in order to test the architectural changes or track down bugs. This commit is just the first stage. Some areas of the code, specifically the MACHINE_CRITICAL_ENTER #ifdef'd code, are strictly temporary and will be cleaned up in the STAGE-2 commit when the critical_*() functions are moved entirely into MD files. The following changes have been made: * critical_enter() and critical_exit() for I386 now simply increment and decrement curthread->td_critnest. They no longer disable hard interrupts. When critical_exit() decrements the counter to 0 it effectively calls a routine to deal with whatever interrupts were deferred during the time the code was operating in a critical section. Other architectures are unaffected. * fork_exit() has been conditionalized to remove MD assumptions for the new code. Old code will still use the old MD assumptions in regards to hard interrupt disablement. In STAGE-2 this will be turned into a subroutine call into MD code rather than hardcoded in MI code. The new code places the burden of entering the critical section in the trampoline code where it belongs. * I386: interrupts are now enabled while we are in a critical section. The interrupt vector code has been adjusted to deal with the fact. If it detects that we are in a critical section it currently defers the interrupt by adding the appropriate bit to an interrupt mask. * In order to accomplish the deferral, icu_lock is required. This is i386-specific.
Thus icu_lock can only be obtained by mainline i386 code while interrupts are hard disabled. This change has been made. * Because interrupts may or may not be hard disabled during a context switch, cpu_switch() can no longer simply assume that PSL_I will be in a consistent state. Therefore, it now saves and restores eflags. * FAST INTERRUPT PROVISION. Fast interrupts are currently deferred. The intention is to eventually allow them to operate either while we are in a critical section or, if we are able to restrict the use of sched_lock, while we are not holding the sched_lock. * ICU and APIC vector assembly for I386 cleaned up. The ICU code has been cleaned up to match the APIC code in regards to format and macro availability. Additionally, the code has been adjusted to deal with deferred interrupts. * Deferred interrupts use a per-cpu boolean int_pending, and masks ipending, spending, and fpending. Being per-cpu variables it is not currently necessary to lock; bus cycles modifying them. Note that the same mechanism will enable preemption to be incorporated as a true software interrupt without having to further hack up the critical nesting code. * Note: the old critical_enter() code in kern/kern_switch.c is currently #ifdef to be compatible with both the old and new methodology. In STAGE-2 it will be moved entirely to MD code. Performance issues: One of the purposes of this commit is to enhance critical section performance, specifically to greatly reduce bus overhead to allow the critical section code to be used to protect per-cpu caches. These caches, such as Jeff's slab allocator work, can potentially operate very quickly making the effective savings of the new critical section code's performance very significant. The second purpose of this commit is to allow architectures to enable certain interrupts while in a critical section. Specifically, the intention is to eventually allow certain FAST interrupts to operate rather than defer.
The third purpose of this commit is to begin to clean up the critical_enter()/critical_exit()/cpu_critical_enter()/ cpu_critical_exit() API which currently has serious cross pollution in MI code (in fork_exit() and ast() for example). The fourth purpose of this commit is to provide a framework that allows kernel-preempting software interrupts to be implemented cleanly. This is currently used for two forward interrupts in I386. Other architectures will have the choice of using this infrastructure or building the functionality directly into critical_enter()/ critical_exit(). Finally, this commit is designed to greatly improve the flexibility of various architectures to manage critical section handling, software interrupts, preemption, and other highly integrated architecture-specific details.
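The nesting-counter scheme described in this commit can be modelled in a few lines of userland C. It is a single-threaded illustration under assumptions: the names are hypothetical, one pending mask stands in for the several per-cpu masks, and "running" an interrupt just records a bit.

```c
static int	td_critnest_sim;	/* nesting depth of critical sections */
static unsigned	ipending_sim;		/* interrupts deferred while nested */
static unsigned	handled_sim;		/* interrupts actually run */

static void
run_irq_sim(int irq)
{
	handled_sim |= 1u << irq;
}

/* Interrupt arrival: defer inside a critical section, else run at once. */
static void
interrupt_sim(int irq)
{
	if (td_critnest_sim > 0)
		ipending_sim |= 1u << irq;
	else
		run_irq_sim(irq);
}

/* Entering a critical section is only a counter increment. */
static void
critical_enter_sim(void)
{
	td_critnest_sim++;
}

/* Leaving the outermost section replays whatever was deferred. */
static void
critical_exit_sim(void)
{
	int irq;

	if (--td_critnest_sim != 0)
		return;
	for (irq = 0; ipending_sim != 0; irq++) {
		if ((ipending_sim & (1u << irq)) != 0) {
			ipending_sim &= ~(1u << irq);
			run_irq_sim(irq);
		}
	}
}
```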
|
#
1cbb9c3b |
|
22-Feb-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Convert p->p_runtime and PCPU(switchtime) to bintime format.
|
#
ab8061d8 |
|
05-Jan-2002 |
Peter Wemm <peter@FreeBSD.org> |
Add a per-cpu variable, cpumask, the preshifted equivalent of 1 << cpuid. We use this around the place a lot.
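A minimal sketch of the idea, assuming a 32-bit mask and CPU IDs below 32. The structure and function names are invented for the example; only the relationship `cpumask == 1 << cpuid` comes from the commit message.

```c
struct pcpu_mask_sim {
	unsigned	pc_cpuid;
	unsigned	pc_cpumask;	/* 1 << pc_cpuid, computed once */
};

/* Set up both fields when the per-CPU structure is initialized. */
static void
pcpu_mask_init_sim(struct pcpu_mask_sim *pc, unsigned cpuid)
{
	pc->pc_cpuid = cpuid;		/* assumed to be < 32 */
	pc->pc_cpumask = 1u << cpuid;
}

/* Users can then mark or test membership without shifting each time. */
static unsigned
cpuset_add_sim(unsigned set, const struct pcpu_mask_sim *pc)
{
	return (set | pc->pc_cpumask);
}

static int
cpuset_has_sim(unsigned set, const struct pcpu_mask_sim *pc)
{
	return ((set & pc->pc_cpumask) != 0);
}
```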
|
#
0bbc8826 |
|
11-Dec-2001 |
John Baldwin <jhb@FreeBSD.org> |
Overhaul the per-CPU support a bit: - The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)). - All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h. - The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures. - Make the show pcpu ddb command MI with a MD callout to display MD fields. - The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list. - A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list. Tested on: alpha, i386 Reviewed by: peter, jake
|
#
ba228f6d |
|
10-May-2001 |
John Baldwin <jhb@FreeBSD.org> |
- Split out the support for per-CPU data from the SMP code. UP kernels have per-CPU data and gdb on the i386 at least needs access to it. - Clean up includes in kern_idle.c and subr_smp.c. Reviewed by: jake
|