History log of /freebsd-current/sys/x86/x86/intr_machdep.c
Revision Date Author Comments
# d9817976 09-May-2024 Elliott Mitchell <ehem+freebsd@m5p.com>

intr/x86: replace use of vector in interface with intsrc

Several x86 interrupt core functions were already operating on intsrc
structures. Now switch the remaining 3 to intsrc for consistency.

Swap the order of intr_add_handler()'s first two arguments. This
matches INTRNG and is more consistent with other functions in this
interface.

Differential Revision: https://reviews.freebsd.org/D35386
Reviewed by: imp, markj (previous version)
Pull Request: https://github.com/freebsd/freebsd-src/pull/1126


# 792655ab 07-Aug-2023 Ed Maste <emaste@FreeBSD.org>

x86: make EARLY_AP_STARTUP mandatory

When early AP startup was introduced in 2016 it was put behind a kernel
option EARLY_AP_STARTUP as a transition aid, so that it could be turned
off if necessary. For x86 the non-EARLY_AP_STARTUP case is no longer
functional, so disallow it.

Other archs are still incompatible with EARLY_AP_STARTUP, so the option
cannot yet be removed entirely.

Reported by: wollman
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41351


# 4258eb5a 17-Aug-2023 Ed Maste <emaste@FreeBSD.org>

x86: handle domains with no CPUs usable for intr delivery

We can end up with a domain having no CPUs capable of receiving I/O
interrupts. This can occur, for example, when all APIC IDs in a given
domain are 256 or greater, and we have no IOMMU.

In this case disable per-domain interrupt support, effectively reverting
to the behaviour before commit a48de40bcc09 ("Only use CPUs in the
domain the device is attached to for default"). This has a performance
impact but at least allows the system to be functional. It is a stop-
gap until we can rely on the presence of an IOMMU on all x86 platforms.

Thanks to AMD for providing the high-thread-count machine I used for
testing this change, and to cperciva for testing on other hardware.

Reviewed by: jhb
Tested by: cperciva, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41501


# 5ad59b91 19-Jan-2023 Elliott Mitchell <ehem+freebsd@m5p.com>

intr: merge interrupt table uses of MAXCOMLEN into INTRNAME_LEN

The repeated uses of `MAXCOMLEN + 1` seem a bit hazardous. If there was
a future need to change the size, the repeats will be troublesome.
Merge everything into `#define INTRNAME_LEN` (matches the name used by
INTRNG).

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D38455


# d8099e33 19-Jun-2022 Elliott Mitchell <ehem+freebsd@m5p.com>

intr: move MAX_STRAY_LOG to interrupt.h

The two interrupt controllers which implement squelching of reports
after a maximum use the same limit. Move the limit to interrupt.h, the
better to encourage other interrupt controllers to implement the same.

Reviewed by: markj
MFC after: 2 weks
Differential Revision: https://reviews.freebsd.org/D35527


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# fe5d8f7a 04-Aug-2023 Ed Maste <emaste@FreeBSD.org>

x86: include CPU ID in "Invalid CPU ID" panic

Sponsored by: The FreeBSD Foundation


# eee65376 06-Oct-2022 Elliott Mitchell <ehem+freebsd@m5p.com>

x86: remove intr_bind

`intr_bind(u_int vector, u_char cpu);` looked suspicious since
everywhere else "cpu" is a u_int and >256 processors isn't unreasonable
now. `intr_bind()` is not used anywhere in FreeBSD (now, after commit
bf42f3738087). Time to remove.

Relnotes: Yes
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D36901


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 1a352c3c 19-Oct-2019 Conrad Meyer <cem@FreeBSD.org>

hw.intrbalance: Make sysctl tunable

This allows specifying a boot-time preference in loader.conf.


# a9126164 04-Oct-2019 Eric van Gyzen <vangyzen@FreeBSD.org>

Make the hw.intrs sysctl OID read-only

The handler ignores the new value, so make the OID read-only.

I found this while working on r353111.

MFC after: 2 weeks
Sponsored by: Dell EMC Isilon


# 2e43efd0 06-Mar-2019 John Baldwin <jhb@FreeBSD.org>

Drop "All rights reserved" from my copyright statements.

Reviewed by: rgrimes
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D19485


# cdb6aa7e 31-Aug-2018 John Baldwin <jhb@FreeBSD.org>

Fix build of x86 UP kernels after dynamic IRQ changes in r338360.

Reported by: Ian FREISLICH <ian.freislich@capeaugusta.com>
Approved by: re (gjb)
MFC after: 2 weeks


# fd036dea 28-Aug-2018 John Baldwin <jhb@FreeBSD.org>

Dynamically allocate IRQ ranges on x86.

Previously, x86 used static ranges of IRQ values for different types
of I/O interrupts. Interrupt pins on I/O APICs and 8259A PICs used
IRQ values from 0 to 254. MSI interrupts used a compile-time-defined
range starting at 256, and Xen event channels used a
compile-time-defined range after MSI. Some recent systems have more
than 255 I/O APIC interrupt pins which resulted in those IRQ values
overflowing into the MSI range triggering an assertion failure.

Replace statically assigned ranges with dynamic ranges. Do a single
pass computing the sizes of the IRQ ranges (PICs, MSI, Xen) to
determine the total number of IRQs required. Allocate the interrupt
source and interrupt count arrays dynamically once this pass has
completed. To minimize runtime complexity these arrays are only sized
once during bootup. The PIC range is determined by the PICs present
in the system. The MSI and Xen ranges continue to use a fixed size,
though this does make it possible to turn the MSI range size into a
tunable in the future.

As a result, various places are updated to use dynamic limits instead
of constants. In addition, the vmstat(8) utility has been taught to
understand that some kernels may treat 'intrcnt' and 'intrnames' as
pointers rather than arrays when extracting interrupt stats from a
crashdump. This is determined by the presence (vs absence) of a
global 'nintrcnt' symbol.

This change reverts r189404 which worked around a buggy BIOS which
enumerated an I/O APIC twice (using the same memory mapped address for
both entries but using an IRQ base of 256 for one entry and a valid
IRQ base for the second entry). Making the "base" of MSI IRQ values
dynamic avoids the panic that r189404 worked around, and there may now
be valid I/O APICs with an IRQ base above 256 which this workaround
would incorrectly skip.

If in the future the issue reported in PR 130483 reoccurs, we will
have to add a pass over the I/O APIC entries in the MADT to detect
duplicates using the memory mapped address and use some strategy to
choose the "correct" one.

While here, reserve room in intrcnts for the Hyper-V counters.

PR: 229429, 130483
Reviewed by: kib, royger, cem
Tested by: royger (Xen), kib (DMAR)
Approved by: re (gjb)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16861


# 27a3c9d7 28-Mar-2018 Jeff Roberson <jeff@FreeBSD.org>

Restore r331606 with a bugfix to setup cpuset_domain[] earlier on all
platforms. Original commit message as follows:

Only use CPUs in the domain the device is attached to for default
assignment. Device drivers are able to override the default assignment
if they bind directly. There are severe performance penalties for
handling interrupts on remote CPUs and this should only be done in
very controlled circumstances.

Reviewed by: jhb, kib
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14838


# 261c4087 27-Mar-2018 Jeff Roberson <jeff@FreeBSD.org>

Backout r331606 until I can identify why it does not boot on some
machines.


# a48de40b 26-Mar-2018 Jeff Roberson <jeff@FreeBSD.org>

Only use CPUs in the domain the device is attached to for default
assignment. Device drivers are able to override the default assignment
if they bind directly. There are severe performance penalties for
handling interrupts on remote CPUs and this should only be done in
very controlled circumstances.

Reviewed by: jhb, kib
Tested by: pho (earlier version)
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14838


# 33099716 23-Feb-2018 Konstantin Belousov <kib@FreeBSD.org>

Do not return out of bound pointers from intr_lookup_source().

This hardens the code against driver and upper level bugs causing
invalid indexes used, e.g. on msi release.

Reported by: gallatin
Reviewed by: gallatin, hselasky
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D14470


# ebf5747b 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/x86: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# 35d87c7e 16-Aug-2017 Conrad Meyer <cem@FreeBSD.org>

Fix unused varable warning in !SMP case

Fallout from r322588. I'm not sure why !SMP is a knob we have, but, we have
it.

Reported by: Michael Butler <imb AT protected-networks.net>
Sponsored by: Dell EMC Isilon


# dc6a8280 16-Aug-2017 Conrad Meyer <cem@FreeBSD.org>

x86: Add dynamic interrupt rebalancing

Add an option to dynamically rebalance interrupts across cores
(hw.intrbalance); off by default.

The goal is to minimize preemption. By placing interrupt sources on distinct
CPUs, ithreads get preferentially scheduled on distinct CPUs. Overall
preemption is reduced and latency is reduced. In our workflow it reduced
"fighting" between two high-frequency interrupt sources. Reduced latency
was proven by, e.g., SPEC2008.

Submitted by: jeff@ (earlier version)
Reviewed by: kib@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D10435


# fecabb72 14-Jun-2017 John Baldwin <jhb@FreeBSD.org>

Don't try to assign interrupts to a CPU on single-CPU systems.

All interrupts are routed to the sole CPU in that case implicitly.
This is a regression in EARLY_AP_STARTUP. Previously the 'assign_cpu'
variable was only set when a multi-CPU system finished booting, so
it's value both meant that interrupts could be assigned and that
there was more than one CPU.

PR: 219882
Reported by: ota@j.email.ne.jp
MFC after: 3 days


# 83c9dea1 17-Apr-2017 Gleb Smirnoff <glebius@FreeBSD.org>

- Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter
in place. To do per-cpu stats, convert all fields that previously were
maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
before we have set up UMA and we can do counter_u64_alloc(), provide an
early counter mechanism:
o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
so that at early stages of boot, before counters are allocated we already
point to a counter that can be safely written to.
o For sparc64 that required a whole dummy pcpu[MAXCPU] array.

Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.

This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html

Reviewed by: kib, gallatin, marius, lidl
Differential Revision: https://reviews.freebsd.org/D10156


# 9ed01c32 17-Apr-2017 Gleb Smirnoff <glebius@FreeBSD.org>

All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.


# 2b375b4e 27-Jan-2017 Yoshihiro Takahashi <nyan@FreeBSD.org>

Remove pc98 support completely.
I thank all developers and contributors for pc98.

Relnotes: yes


# b9f62e3a 11-Sep-2016 Sepherosa Ziehau <sephe@FreeBSD.org>

x86: Use sx lock for interrupt sources.

- Certain pic_assign_cpu, e.g. msi_assign_cpu can have quite a long
call chain. For msi_assign_cpu, mutex makes complex PCI bridge
drivers more tricky, e.g. sleep can note be called, etc, it will
be pretty tricky for upcoming Hyper-V PCI bridge driver for PCI
pass-through.
- It is not used on any hot code path nor non-sleepable context, so
sx should have the same effect as mutex.

PIC list is still protected by mutex to keep suspend/resume work.

Discussed with: jhb
Reviewed by: jhb
MFC after: 3 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7784


# 23006680 29-Jul-2016 Roger Pau Monné <royger@FreeBSD.org>

Revert r291022: x86/intr: allow mutex recursion in intr_remove_handler

This was only needed for Xen, and a better way to deal with this issue has
been found, so this commit can be reverted.

Sponsored by: Citrix Systems R&D
MFC after: 5 days
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D7363


# fdce57a0 14-May-2016 John Baldwin <jhb@FreeBSD.org>

Add an EARLY_AP_STARTUP option to start APs earlier during boot.

Currently, Application Processors (non-boot CPUs) are started by
MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until
SI_SUB_SMP at which point they are released to run kernel threads.
SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter
the scheduler and start running threads until fairly late in the
boot.

This change moves SI_SUB_SMP up to just before software interrupt
threads are created allowing the APs to start executing kernel
threads much sooner (before any devices are probed). This allows
several initialization routines that need to perform initialization
on all CPUs to now perform that initialization in one step rather
than having to defer the AP initialization to a second SYSINIT run
at SI_SUB_SMP. It also permits all CPUs to be available for
handling interrupts before any devices are probed.

This last feature fixes a problem on with interrupt vector exhaustion.
Specifically, in the old model all device interrupts were routed
onto the boot CPU during boot. Later after the APs were released at
SI_SUB_SMP, interrupts were redistributed across all CPUs.

However, several drivers for multiqueue hardware allocate N interrupts
per CPU in the system. In a system with many CPUs, just a few drivers
doing this could exhaust the available pool of interrupt vectors on
the boot CPU as each driver was allocating N * mp_ncpu vectors on the
boot CPU. Now, drivers will allocate interrupts on their desired CPUs
during boot meaning that only N interrupts are allocated from the boot
CPU instead of N * mp_ncpu.

Some other bits of code can also be simplified as smp_started is
now true much earlier and will now always be true for these bits of
code. This removes the need to treat the single-CPU boot environment
as a special case.

As a transition aid, the new behavior is available under a new kernel
option (EARLY_AP_STARTUP). This will allow the option to be turned off
if need be during initial testing. I plan to enable this on x86 by
default in a followup commit in the next few days and to have all
platforms moved over before 11.0. Once the transition is complete,
the option will be removed along with the !EARLY_AP_STARTUP code.

These changes have only been tested on x86. Other platform maintainers
are encouraged to port their architectures over as well. The main
things to check for are any uses of smp_started in MD code that can be
simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in
the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).

PR: kern/199321
Reviewed by: markj, gnn, kib
Sponsored by: Netflix


# 8d791e5a 09-May-2016 John Baldwin <jhb@FreeBSD.org>

Add a new bus method to fetch device-specific CPU sets.

bus_get_cpus() returns a specified set of CPUs for a device. It accepts
an enum for the second parameter that indicates the type of cpuset to
request. Currently two valus are supported:

- LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
the device when DEVICE_NUMA is enabled)
- INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL. INTR_CPUS is mapped to 'all_cpus'
by default. The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from the
the x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled. They also and
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.

Compared to the r298933, this version uses 'struct _cpuset' in
<sys/bus.h> instead of 'cpuset_t' to avoid requiring <sys/param.h>
(<sys/_cpuset.h> still requires <sys/param.h> for MAXCPU even though
<sys/_bitset.h> does not after recent changes).


# 8a08b7d3 02-May-2016 John Baldwin <jhb@FreeBSD.org>

Revert bus_get_cpus() for now.

I really thought I had run this through the tinderbox before committing,
but many places need <sys/types.h> -> <sys/param.h> for <sys/bus.h> now.


# bc153c69 02-May-2016 John Baldwin <jhb@FreeBSD.org>

Add a new bus method to fetch device-specific CPU sets.

bus_get_cpus() returns a specified set of CPUs for a device. It accepts
an enum for the second parameter that indicates the type of cpuset to
request. Currently two valus are supported:

- LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
the device when DEVICE_NUMA is enabled)
- INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL. INTR_CPUS is mapped to 'all_cpus'
by default. The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from the
the x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled. They also and
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.

Reviewed by: wblock (manpage)
Differential Revision: https://reviews.freebsd.org/D5519


# 7a2c1d8c 23-Mar-2016 John Baldwin <jhb@FreeBSD.org>

Enable interrupts on the BSP once all PICs are initialized.

This moves the enabling of interrupts slightly earlier (the old location
was still before devices were enumerated and probed) and does it in the
interrupt code (rather than in the device configuration code). This
also avoids tripping over an assertion on the first TLB shootdown with
earlier AP startup.

Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D5710


# 531cfe55 18-Nov-2015 Roger Pau Monné <royger@FreeBSD.org>

x86/intr: allow mutex recursion in intr_remove_handler

This is needed so interrupt handlers can be removed while the PIC is
resuming, it was previously not possible due to intr_resume holding the
intr_table_lock and intr_remove_handler recursing on it.

Sponsored by: Citrix Systems R&D
Reviewed by: kib (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D4114


# ed95805e 30-Apr-2015 John Baldwin <jhb@FreeBSD.org>

Remove support for Xen PV domU kernels. Support for HVM domU kernels
remains. Xen is planning to phase out support for PV upstream since it
is harder to maintain and has more overhead. Modern x86 CPUs include
virtualization extensions that support HVM guests instead of PV guests.
In addition, the PV code was i386 only and not as well maintained recently
as the HVM code.
- Remove the i386-only NATIVE option that was used to disable certain
components for PV kernels. These components are now standard as they
are on amd64.
- Remove !XENHVM bits from PV drivers.
- Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3,
etc.)
- Remove duplicate copy of <xen/features.h>.
- Remove unused, i386-only xenstored.h.

Differential Revision: https://reviews.freebsd.org/D2362
Reviewed by: royger
Tested by: royger (i386/amd64 HVM domU and amd64 PVH dom0)
Relnotes: yes


# 0a110d5b 19-Mar-2015 Konstantin Belousov <kib@FreeBSD.org>

Use VT-d interrupt remapping block (IR) to perform FSB messages
translation. In particular, despite IO-APICs only take 8bit apic id,
IR translation structures accept 32bit APIC Id, which allows x2APIC
mode to function properly. Extend msi_cpu of struct msi_intrsrc and
io_cpu of ioapic_intsrc to full int from one byte.

KPI of IR is isolated into the x86/iommu/iommu_intrmap.h, to avoid
bringing all dmar headers into interrupt code. The non-PCI(e) devices
which generate message interrupts on FSB require special handling. The
HPET FSB interrupts are remapped, while DMAR interrupts are not.

For each msi and ioapic interrupt source, the iommu cookie is added,
which is in fact index of the IRE (interrupt remap entry) in the IR
table. Cookie is made at the source allocation time, and then used at
the map time to fill both IRE and device registers. The MSI
address/data registers and IO-APIC redirection registers are
programmed with the special values which are recognized by IR and used
to restore the IRE index, to find proper delivery mode and target.
Map all MSI interrupts in the block when msi_map() is called.

Since an interrupt source setup and dismantle code are done in the
non-sleepable context, flushing interrupt entries cache in the IR
hardware, which is done async and ideally waits for the interrupt,
requires busy-wait for queue to drain. The dmar_qi_wait_for_seq() is
modified to take a boolean argument requesting busy-wait for the
written sequence number instead of waiting for interrupt.

Some interrupts are configured before IR is initialized, e.g. ACPI
SCI. Add intr_reprogram() function to reprogram all already
configured interrupts, and call it immediately before an IR unit is
enabled. There is still a small window after the IO-APIC redirection
entry is reprogrammed with cookie but before the unit is enabled, but
to fix this properly, IR must be started much earlier.

Add workarounds for 5500 and X58 northbridges, some revisions of which
have severe flaws in handling IR. Use the same identification methods
as employed by Linux.

Review: https://reviews.freebsd.org/D1892
Reviewed by: neel
Discussed with: jhb
Tested by: glebius, pho (previous versions)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks


# 066da805 17-Sep-2014 Adrian Chadd <adrian@FreeBSD.org>

Migrate ie->ie_assign_cpu and associated code to use an int for CPU rather
than u_char.

Migrate post_filter to use an int for a CPU rather than u_char.

Change intr_event_bind() to use an int for CPU rather than u_char.

It touches the ppc, sparc64, arm and mips machdep code but it should
(hah!) be a no-op.

Tested:

* i386, AMD64 laptops

Reviewed by: jhb


# f79309d2 19-Mar-2014 Warner Losh <imp@FreeBSD.org>

Remove vestiges of knowing the ISA bus, which we gave up on around 20
years ago. Remove redunant copy of isaregs.h.


# e432d5f6 05-Feb-2014 John Baldwin <jhb@FreeBSD.org>

Drop the 3rd clause from all 3 clause BSD licenses where I am the sole
holder to convert them to 2 clause BSD licenses.

MFC after: 1 week


# 428b7ca2 19-Sep-2013 Justin T. Gibbs <gibbs@FreeBSD.org>

Add support for suspend/resume/migration operations when running as a
Xen PVHVM guest.

Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Reviewed by: gibbs
Approved by: re (blanket Xen)
MFC after: 2 weeks

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
- Make sure that are no MMU related IPIs pending on migration.
- Reset pending IPI_BITMAP on resume.
- Init vcpu_info on resume.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
sys/x86/acpica/acpi_wakeup.c:
sys/x86/x86/intr_machdep.c:
sys/x86/isa/atpic.c:
sys/x86/x86/io_apic.c:
sys/x86/x86/local_apic.c:
- Add a "suspend_cancelled" parameter to pic_resume(). For the
Xen PIC, restoration of interrupt services differs between
the aborted suspend and normal resume cases, so we must provide
this information.

sys/dev/acpica/acpi_timer.c:
sys/dev/xen/timer/timer.c:
sys/timetc.h:
- Don't swap out "suspend safe" timers across a suspend/resume
cycle. This includes the Xen PV and ACPI timers.

sys/dev/xen/control/control.c:
- Perform proper suspend/resume process for PVHVM:
- Suspend all APs before going into suspension, this allows us
to reset the vcpu_info on resume for each AP.
- Reset shared info page and callback on resume.

sys/dev/xen/timer/timer.c:
- Implement suspend/resume support for the PV timer. Since FreeBSD
doesn't perform a per-cpu resume of the timer, we need to call
smp_rendezvous in order to correctly resume the timer on each CPU.

sys/dev/xen/xenpci/xenpci.c:
- Don't reset the PCI interrupt on each suspend/resume.

sys/kern/subr_smp.c:
- When suspending a PVHVM domain make sure there are no MMU IPIs
in-flight, or we will get a lockup on resume due to the fact that
pending event channels are not carried over on migration.
- Implement a generic version of restart_cpus that can be used by
suspended and stopped cpus.

sys/x86/xen/hvm.c:
- Implement resume support for the hypercall page and shared info.
- Clear vcpu_info so it can be reset by APs when resuming from
suspension.

sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/x86/xen/xen_intr.c:
- Support UP kernel configurations.

sys/x86/xen/xen_intr.c:
- Properly rebind per-cpus VIRQs and IPIs on resume.


# 548b2016 01-Feb-2013 Andriy Gapon <avg@FreeBSD.org>

x86 suspend/resume: suspend pics and pseudo-pics in reverse order

- change 'pics' from STAILQ to TAILQ
- ensure that Local APIC is always first in 'pics'

Reviewed by: jhb
Tested by: Sergey V. Dyatko <sergey.dyatko@gmail.com>,
KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
MFC after: 12 days


# af2bdaca 08-Oct-2012 Attilio Rao <attilio@FreeBSD.org>

Reverts r234074,234105,234564,234723,234989,235231-235232 and part of
r234247.
Use, instead, the static intializer introduced in r239923 for x86 and
sparc64 intr_cpus, unwinding the code to the initial version.

Reviewed by: marius


# b8be27bf 03-May-2012 Attilio Rao <attilio@FreeBSD.org>

Revert part of r234723 by re-enabling the SMP protection for
intr_bind() on x86.
This has been requested by jhb and I strongly disagree with this,
but as long as he is the x86 and interrupt subsystem maintainer I will
follow his directives.

The disagreement cames from what we should really consider as a
public KPI. IMHO, if we really need a selection between the kernel
functions, we may need an explicit protection like _KERNEL_KPI, which
defines which subset of the kernel function might really be considered
as part of the KPI (for thirdy part modules) and which not.
As long as we don't have this mechanism I just consider any possible
function as usable by thirdy part code, thus intr_bind() included.

MFC after: 1 week


# 70dbd160 26-Apr-2012 Attilio Rao <attilio@FreeBSD.org>

Clean up the intr* MD KPI from the SMP dependency, removing a cause of
discrepancy between modules and kernel, but deal with SMP differences
within the functions themselves.

As an added bonus this also helps in terms of code readability.

Requested by: gibbs
Reviewed by: jhb, marius
MFC after: 1 week


# 47c77b22 06-Apr-2012 Justin T. Gibbs <gibbs@FreeBSD.org>

Fix interrupt load balancing regression, introduced in revision
222813, that left all un-pinned interrupts assigned to CPU 0.

sys/x86/x86/intr_machdep.c:
In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized
the "intr_cpus" cpuset to only contain CPU0.

This initialization is too late and nullifies the results of calls
the intr_add_cpu() that occur much earlier in the boot process.
Since "intr_cpus" is statically initialized to the empty set, and
all processors, including the BSP, already add themselves to
"intr_cpus" no special initialization for the BSP is necessary.

MFC after: 3 days


# dff207f8 15-Mar-2012 Yoshihiro Takahashi <nyan@FreeBSD.org>

- Fix to build a native i386 kernel without the SMP and atpic.
- Merge r232744 changes to pc98.
(Allow a kernel to be built with 'nodevice atpic'.)
- Move ICU related defines from x86/isa/atpic.c to x86/isa/icu.h and
use them in x86/x86/intr_machdep.c.

Reviewed by: jhb


# 646af7c6 09-Mar-2012 John Baldwin <jhb@FreeBSD.org>

Move i386's intr_machdep.c to the x86 tree and share it with amd64.