#
ec8a394d |
|
11-Apr-2024 |
Elyes Haouas <ehaouas@noos.fr> |
usr.sbin: Remove repeated words Signed-off-by: Elyes Haouas <ehaouas@noos.fr> Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/887
|
#
95b948c1 |
|
19-Feb-2024 |
Jessica Clarke <jrtc27@jrtc27.com> |
bhyve: Fix arm64 PCI I/O range to match FDT This is supposed to combine with the memory range to make one contiguous block, as is laid out in the FDT, so make this match what the OS is told and thus actually configures. Also drop the confusing leading zero from all three of these constants that is making these 9 rather than 8 hex digits long (as one would expect for a 32-bit address). Reviewed by: jhb MFC after: 2 weeks Obtained from: CheriBSD
|
#
0efad4ac |
|
16-Feb-2024 |
Jessica Clarke <jrtc27@jrtc27.com> |
bhyve: Support legacy PCI interrupts on arm64 This allows us to remove various #ifdef hacks and enable building more PCI devices. Note that a hole is left in the interrupt mapping for the RTC rather than having the two core devices straddle the PCIe interrupts. QEMU's virt machine also takes this approach. Reviewed by: jhb MFC after: 2 weeks Obtained from: CheriBSD
|
#
dc6a00f2 |
|
03-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Use vm_raise_msi() instead of vm_lapic_msi() No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41740
|
#
f286f746 |
|
03-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Add PCI mappings for arm64 - The extended config space and BAR ranges are listed in the FDT. - Avoid referencing I/O ports in ACPI tables. Currently the arm64 port does not support ACPI in any case. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41739
|
#
fc98569f |
|
03-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Do not compile PCI passthrough support on arm64 Some required kernel functionality is not yet implemented. For now this means that one cannot specify host PCI register values, but that functionality is only used by amd64-specific device models for now. Note that this limitation is rather artificial; it arises only because pci_host_read_config() lives in pci_passthru.c. Reviewed by: corvink, andrew, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41738
|
#
e497fe86 |
|
02-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Use vm_get_highmem_base() instead of hard-coding the value This reduces the coupling between libvmmapi (which creates the highmem segment) and bhyve, in preparation for the arm64 port. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40992
|
#
4d65a7c6 |
|
24-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
usr.sbin: Automated cleanup of cdefs and other formatting Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/ Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/ Remove /\n+#if.*\n#endif.*\n+/ Remove /^#if.*\n#endif.*\n/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/ Sponsored by: Netflix
|
#
31cf78c9 |
|
03-Oct-2023 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Make most I/O port handling specific to amd64 - The qemu_fwcfg interface, as implemented, is I/O port-based, but QEMU implements an MMIO interface that we'll eventually want to port for arm64. - Retain support for I/O space PCI BARs, simply treat them like MMIO BARs for most purposes, similar to what the arm64 kernel does. Such BARs are created by virtio devices. Reviewed by: corvink, jhb MFC after: 1 week Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40741
|
#
55c13f6e |
|
03-Oct-2023 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Move legacy PCI interrupt handling under amd64/ Specifically, move IO-APIC, LPC and PIRQ routing code under amd64/. Use ifdefs to conditionally compile related code in other files. In particular, legacy PCI interrupt handling is now compiled only on amd64. This is not too invasive, but suggestions for a more modular approach would be appreciated. I am not sure why qemu fwcfg handling is tied to LPC, and I suspect it should be decoupled. In this commit I just apply an ifdef hammer, but we will eventually want fwcfg on arm64 as well. No functional change intended. Reviewed by: corvink, jhb MFC after: 1 week Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40739
|
#
01d53c34 |
|
03-Oct-2023 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Improve pcifd function naming read_config() and write_config() are externally visible, so give them more descriptive names. No functional change intended. MFC after: 1 week Sponsored by: Innovate UK
|
#
1d386b48 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
b3e76948 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
0dea4f06 |
|
11-Jul-2023 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Deduplicate some code in modify_bar_registration() No functional change intended. Reviewed by: corvink, jhb MFC after: 1 week Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40877
|
#
f4841d8a |
|
28-Jun-2023 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Rename a pci_cfgrw() parameter pci_cfgrw() may be called via a write to the extended config space, which is memory-mapped. In this case, the name "eax" is misleading. Give it a more generic name. No functional change intended. Reviewed by: corvink, jhb MFC after: 1 week Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40732
|
#
6632a0a4 |
|
16-Aug-2021 |
Corvin Köhne <corvink@FreeBSD.org> |
bhyve: add helper to create a bootorder Qemu's fwcfg allows defining a bootorder. To do so, the hypervisor has to create a fwcfg item named bootorder, which holds a newline-separated list of boot entries. Qemu's OVMF will pick up the bootorder and apply it. At the moment, bhyve's OVMF doesn't support a custom bootorder via qemu's fwcfg. However, in the future bhyve will gain support for qemu's OVMF. Additionally, we can port relevant parts from qemu's to bhyve's OVMF implementation. Reviewed by: jhb, markj MFC after: 1 week Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D39284
|
#
381ef27d |
|
15-May-2023 |
Vitaliy Gusev <gusev.vitaliy@gmail.com> |
bhyve: use pci_next() to save/restore pci devices Current snapshot implementation doesn't support multiple devices with similar type. For example, two virtio-blk or two CD-ROM-s, etc. So the following configuration cannot be restored. bhyve \ -s 3,virtio-blk,disk.img \ -s 4,virtio-blk,disk2.img In some cases it is restored silently, but doesn't work. In some cases it fails during restore stage. This commit fixes that issue. Reviewed by: corvink, rew MFC after: 1 week Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D40109
|
#
14c80457 |
|
15-May-2023 |
Vitaliy Gusev <gusev.vitaliy@gmail.com> |
bhyve: add bus, slot and func to device name Each device needs a unique identifier to store and restore snapshots properly. Adding the pci bsf information to the device name creates a unique identifier as a bsf can't be occupied twice. Reviewed by: corvink MFC after: 1 week Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D40107
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
ffaed739 |
|
06-Feb-2023 |
Corvin Köhne <corvink@FreeBSD.org> |
bhyve: add helper to read PCI IDs from bhyve config Changing the PCI IDs is valuable in some situations. The Intel GOP driver requires that some PCI IDs of the LPC bridge match the physical values of the host LPC bridge. Another use case is Oracle's virtio driver, which requires a different subvendor ID than the default one. For that reason, create a helper which makes it easy to read PCI IDs from bhyve config. Additionally, this helper ensures that all emulation devices use the same config keys. Reviewed by: jhb MFC after: 1 week Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D38402
|
#
7d9ef309 |
|
24-Mar-2023 |
John Baldwin <jhb@FreeBSD.org> |
libvmmapi: Add a struct vcpu and use it in most APIs. This replaces the 'struct vm, int vcpuid' tuple passed to most API calls and is similar to the changes recently made in vmm(4) in the kernel. struct vcpu is an opaque type managed by libvmmapi. For now it stores a pointer to the VM context and an integer id. As an immediate effect this removes the divergence between the kernel and userland for the instruction emulation code introduced by the recent vmm(4) changes. Since this is a major change to the vmmapi API, bump VMMAPI_VERSION to 0x200 (2.0) and the shared library major version. While here (and since the major version is bumped), remove unused vcpu argument from vm_setup_pptdev_msi*(). Add new functions vm_suspend_all_cpus() and vm_resume_all_cpus() for use by the debug server. The underlying ioctl (which uses a vcpuid of -1) remains unchanged, but the userlevel API now uses separate functions for global CPU suspend/resume. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38124
|
#
6a284cac |
|
19-Jan-2023 |
John Baldwin <jhb@FreeBSD.org> |
bhyve: Remove vmctx argument from PCI device model methods. Most of these arguments were unused. Device models which do need access to the vmctx in one of these methods can obtain it from the pi_vmctx member of the pci_devinst argument instead. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38096
|
#
08b05de1 |
|
09-Dec-2022 |
John Baldwin <jhb@FreeBSD.org> |
bhyve: Remove the unused vcpu argument from all of the I/O port handlers. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37653
|
#
78c2cd83 |
|
09-Dec-2022 |
John Baldwin <jhb@FreeBSD.org> |
bhyve: Remove unused vcpu argument from PCI read/write methods. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37652
|
#
ed721684 |
|
23-Oct-2022 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Address some signed/unsigned comparison warnings MFC after: 1 week
|
#
c9faf698 |
|
22-Oct-2022 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Fix some warnings in the snapshot code - Qualify unexported symbols with "static". - Drop some unnecessary and incorrect casts. - Avoid arithmetic on void pointers. - Avoid signed/unsigned comparisons in loops which use nitems() as a bound. No functional change intended. MFC after: 1 week
|
#
07d82562 |
|
08-Oct-2022 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Make pci_bars local to pci_emul.c MFC after: 1 week
|
#
98d920d9 |
|
08-Oct-2022 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Annotate unused function parameters MFC after: 1 week
|
#
37045dfa |
|
16-Aug-2022 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Mark variables and functions as static where appropriate Mark them const as well when it makes sense to do so. No functional change intended. MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
#
75ce327a |
|
16-Aug-2022 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Use "void" instead of empty parameter lists MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
#
8ac8adda |
|
01-Apr-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: avoid uninitialized variable Reviewed by: markj Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reported-by: Andy Fiddaman <andy@omniosce.org> Differential Revision: https://reviews.freebsd.org/D34688
|
#
45ddbf21 |
|
01-Apr-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: avoid overflow of BAR index At the moment, writes to BAR registers that aren't 4 byte aligned are ignored. So, there's no overflow yet. Nevertheless, if this behaviour changes in the future, it could unintentionally introduce a buffer overflow. Additionally, some compilers or tools will detect this potential overflow and complain about it. Reviewed by: markj Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reported-by: Andy Fiddaman <andy@omniosce.org> Differential Revision: https://reviews.freebsd.org/D34689
|
#
e47fe318 |
|
10-Mar-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: add ROM emulation Some PCI devices especially GPUs require a ROM to work properly. The ROM is executed by boot firmware to initialize the device. To add a ROM to a device use the new ROM option for passthru device (e.g. -s passthru,0/2/0,rom=<path>/<to>/<rom>). It's necessary that the ROM is executed by the boot firmware. It won't be executed by any OS. Additionally, the boot firmware should be configured to execute the ROM file. For that reason, it's only possible to use a ROM when using OVMF with enabled bus enumeration. Differential Revision: https://reviews.freebsd.org/D33129 Sponsored by: Beckhoff Automation GmbH & Co. KG MFC after: 1 month
|
#
7d55d295 |
|
03-Jan-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: add more slop to 64 bit BARs Bhyve allocates small 64 bit BARs below 4 GB and generates ACPI tables based on this allocation. If the guest decides to relocate those BARs above 4 GB, it could lead to mismatching ACPI tables. Especially when using OVMF with enabled bus enumeration it could cause issues. OVMF relocates all 64 bit BARs above 4 GB. The guest OS may be unable to recover from this situation and disables some PCI devices because their BARs are located outside of the MMIO space reported by ACPI. Avoid this situation by giving the guest more space for relocating BARs. Let's be paranoid. The available space for BARs below 4 GB is 512 MB large. Use a slop of 512 MB. It'll allow the guest to relocate all BARs below 4 GB to an address above 4 GB. We could run into issues if we exceed the memlimit above 4 GB. However, this space has a size of 32 GB. Even when using many PCI devices with large BARs like framebuffers or when using multiple PCI busses, it's very unlikely that we run out of space due to the large slop. Additionally, this situation will occur on startup and not at runtime which is much better. Reviewed by: markj MFC after: 2 weeks Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D33118
|
#
01f9362e |
|
03-Jan-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: enumerate BARs by size E.g. Framebuffers can require large space and BARs need to be aligned by their size. If BARs aren't allocated by size, it'll cause much fragmentation of the MMIO space. Reduce fragmentation by ordering the BAR allocation on their size to reduce the risk of OUT_OF_MMIO_SPACE issues. Reviewed by: markj MFC after: 2 weeks Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D28278
|
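The size-ordered allocation described in the row above can be sketched as follows. This is only an illustration of the idea, not bhyve's actual code: `struct bar_req`, `cmp_size_desc()` and `alloc_bars()` are hypothetical names, and BAR sizes are assumed to be powers of two so that each BAR is naturally aligned to its size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request record: one entry per BAR to be placed. */
struct bar_req {
	uint64_t size;	/* power of two */
	uint64_t addr;	/* filled in by alloc_bars() */
};

/* Order requests from largest to smallest. */
static int
cmp_size_desc(const void *a, const void *b)
{
	const struct bar_req *x = a, *y = b;

	return ((x->size < y->size) - (x->size > y->size));
}

/*
 * Place every BAR in [base, limit), largest first, each aligned to its
 * own size. Returns 0 on success, -1 if the window is exhausted.
 */
static int
alloc_bars(struct bar_req *bars, size_t n, uint64_t base, uint64_t limit)
{
	uint64_t next = base;
	size_t i;

	qsort(bars, n, sizeof(bars[0]), cmp_size_desc);
	for (i = 0; i < n; i++) {
		uint64_t size = bars[i].size;
		uint64_t addr;

		assert(size != 0 && (size & (size - 1)) == 0);
		addr = (next + size - 1) & ~(size - 1);
		if (addr < next || addr + size > limit)
			return (-1);
		bars[i].addr = addr;
		next = addr + size;
	}
	return (0);
}
```

Because the largest BARs are placed first, every later (smaller) BAR starts at an address that is already aligned for it, so no padding is lost between allocations.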
#
c2fa905c |
|
26-Dec-2021 |
Toomas Soome <tsoome@FreeBSD.org> |
bhyve: clean up trailing whitespaces Clean up trailing whitespaces. No functional changes. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D33681
|
#
fc7207c8 |
|
22-Nov-2021 |
Emmanuel Vadot <manu@FreeBSD.org> |
bhyve: Fix compile We need err.h Fixes: 5cf21e48ccf11 ("bhyve: use a fixed 32 bit BAR base address") Sponsored by: Beckhoff Automation GmbH & Co. KG
|
#
5cf21e48 |
|
22-Nov-2021 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: use a fixed 32 bit BAR base address OVMF always uses 0xC0000000 as base address for 32 bit PCI MMIO space. For that reason, we should use that address too. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D31051 Sponsored by: Beckhoff Automation GmbH & Co. KG
|
#
4a4053e1 |
|
22-Nov-2021 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: move 64 bit BAR location to match OVMF assumptions OVMF will fail if large 64 bit BARs are used. GCD-Map doesn't cover 64 bit addresses of BARs. OVMF assumes that 64 bit addresses of BARs are located on the next 32 GB boundary behind Top of High RAM. This patch moves 64 bit BARs to the next 32 GB boundary behind Top of High RAM to match OVMF assumptions. Differential Revision: https://reviews.freebsd.org/D27970 Sponsored by: Beckhoff Automation GmbH & Co. KG
|
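The placement rule in the row above is a round-up to a 32 GB boundary. A minimal sketch, assuming that a top-of-RAM address already on a boundary stays where it is (the commit message does not say which way that case goes); `bar64_base_after()` is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

#define GB		(UINT64_C(1) << 30)
#define BAR64_ALIGN	(32 * GB)	/* boundary OVMF is said to assume */

/* Round the top of high RAM up to the next 32 GB boundary. */
static uint64_t
bar64_base_after(uint64_t highmem_top)
{
	assert(highmem_top <= UINT64_MAX - BAR64_ALIGN);
	return ((highmem_top + BAR64_ALIGN - 1) & ~(BAR64_ALIGN - 1));
}
```

For a guest whose high RAM ends at 12 GB, this puts the 64-bit BAR window at 32 GB.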
#
e87a6f3e |
|
18-Nov-2021 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: use physical lobits for BARs of passthru devices Tell the guest whether a BAR uses prefetched memory or not for passthru devices by using the same lobits as the physical device. Reviewed by: grehan Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D32685
|
#
77bc75c7 |
|
16-Oct-2021 |
Mark Johnston <markj@FreeBSD.org> |
bhyve: Fix the WITH_BHYVE_SNAPSHOT build Note, this breaks compatibility with snapshots generated by older builds of bhyve(8). Fixes: 7fa233534736 ("bhyve: Map the MSI-X table unconditionally for passthrough") Reported by: Greg V <greg@unrelenting.technology> Reviewed by: grehan, bz Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32523
|
#
1b0e2f0b |
|
15-Oct-2021 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: ignore low bits of CFGADR Bhyve could emulate wrong PCI registers. In the best case, the guest reads wrong registers and the device driver would report some errors. In the worst case, the guest writes to wrong PCI registers and could brick hardware when using PCI passthrough. According to Intel's specification, low bits of CFGADR should be ignored. Some OSes, like Linux, may rely on it. Otherwise, bhyve could emulate a wrong PCI register. E.g. if Linux would like to read 2 bytes from offset 0x02, the following would happen. linux: outl 0x80000002 at CFGADR inw at CFGDAT + 2 bhyve: cfgoff = 0x80000002 & 0xFF = 0x02 coff = cfgoff + (port - CFGDAT) = 0x02 + 0x02 = 0x04 Bhyve would emulate the register at offset 0x04 not 0x02. Reviewed By: #bhyve, grehan Differential Revision: https://reviews.freebsd.org/D31819 Sponsored by: Beckhoff Automation GmbH & Co. KG
|
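The arithmetic in the row above can be reproduced with a small sketch. The two helpers are hypothetical and model only the offset calculation. `0x0cfc` is the conventional CFGDAT port on PC-compatible systems, and masking with `0xfc` is one way to drop the two low bits of CFGADR, assumed here rather than taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define CONF1_DATA_PORT	0x0cfc	/* CFGDAT on PC-compatible systems */

/* Old behaviour: the low two bits of CFGADR leak into the offset. */
static inline uint32_t
cfg_offset_unmasked(uint32_t cfgadr, uint16_t port)
{
	assert(port >= CONF1_DATA_PORT && port <= CONF1_DATA_PORT + 3);
	return ((cfgadr & 0xff) + (uint32_t)(port - CONF1_DATA_PORT));
}

/* Fixed behaviour: CFGADR selects a dword, the data port picks the byte. */
static inline uint32_t
cfg_offset_masked(uint32_t cfgadr, uint16_t port)
{
	assert(port >= CONF1_DATA_PORT && port <= CONF1_DATA_PORT + 3);
	return ((cfgadr & 0xfc) + (uint32_t)(port - CONF1_DATA_PORT));
}
```

With `cfgadr = 0x80000002` and a 2-byte read at `CFGDAT + 2`, the unmasked variant yields offset 0x04, while the masked variant yields the intended 0x02.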
#
f8a6ec2d |
|
18-Mar-2021 |
D Scott Phillips <scottph@FreeBSD.org> |
bhyve: support relocating fbuf and passthru data BARs We want to allow the UEFI firmware to enumerate and assign addresses to PCI devices so we can boot from NVMe[1]. Address assignment of PCI BARs is properly handled by the PCI emulation code in general, but a few specific cases need additional support. fbuf and passthru map additional objects into the guest physical address space and so need to handle address updates. Here we add a callback to emulated PCI devices to inform them of a BAR configuration change. fbuf and passthru then watch for these BAR changes and relocate the frame buffer memory segment and passthru device mmio area respectively. We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls to vmm(4) to facilitate the unmapping needed for address updates. [1]: https://github.com/freebsd/uefi-edk2/pull/9/ Originally by: scottph MFC After: 1 week Sponsored by: Intel Corporation Reviewed by: grehan Approved by: philip (mentor) Differential Revision: https://reviews.freebsd.org/D24066
|
#
621b5090 |
|
26-Jun-2019 |
John Baldwin <jhb@FreeBSD.org> |
Refactor configuration management in bhyve. Replace the existing ad-hoc configuration via various global variables with a small database of key-value pairs. The database supports hierarchical keys using a MIB-like syntax to name the path to a given key. Values are always stored as strings. The API used to manage configuration values does include wrappers to handle boolean values. Other values that use non-string types require parsing by consumers. The configuration values are stored in a tree using nvlists. Leaf nodes hold string values. Configuration values are permitted to reference other configuration values using '%(name)'. This permits constructing template configurations. All existing command line arguments now set configuration values. For devices, the "-s" option parses its option argument to generate a list of key-value pairs for the given device. A new '-o' command line option permits setting an individual configuration variable. The key name is always given as a full path of dot-separated components. A new '-k' command line option parses a simple configuration file. This configuration file holds a flat list of 'key=value' lines where the 'key' is the full path of a configuration variable. Lines starting with a '#' are comments. In general, bhyve starts by parsing command line options in sequence and applying those settings to configuration values. Once this is complete, bhyve then begins initializing its state based on the configuration values. This means that subsequent configuration options or files may override or supplement previously given settings. A special 'config.dump' configuration value can be set to true to help debug configuration issues. When this value is set, bhyve will print out the configuration variables as a flat list of 'key=value' lines. Most command line arguments map to a single configuration variable, e.g. '-w' sets the 'x86.strictmsr' value to false. 
A few command line arguments have less obvious effects: - Multiple '-p' options append their values (as a comma-separated list) to "vcpu.N.cpuset" values (where N is a decimal vcpu number). - For '-s' options, a pci.<bus>.<slot>.<function> node is created. The first argument to '-s' (the device type) is used as the value of a "device" variable. Additional comma-separated arguments are then parsed into 'key=value' pairs and used to set additional variables under the device node. A PCI device emulation driver can provide its own hook to override the parsing of the additional '-s' arguments after the device type. After the configuration phase has completed, the init_pci hook then walks the "pci.<bus>.<slot>.<func>" nodes. It uses the "device" value to find the device model to use. The device model's init routine is passed a reference to its nvlist node in the configuration tree which it can query for specific variables. The result is that a lot of the string parsing is removed from the device models and centralized. In addition, adding a new variable just requires teaching the model to look for the new variable. - For '-l' options, a similar model is used where the string is parsed into values that are later read during initialization. One key note here is that the serial ports use the commonly used lowercase names from existing documentation and examples (e.g. "lpc.com1") instead of the uppercase names previously used internally in bhyve. Reviewed by: grehan MFC after: 3 months Differential Revision: https://reviews.freebsd.org/D26035
|
#
038f5c7b |
|
11-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
bhyve: remove a hack to map all 8G BARs 1:1 Suggested and reviewed by: grehan Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D27186
|
#
670b364b |
|
11-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
bhyve: increase allowed size for 64bit BAR allocation below 4G from 32 to 128 MB. Reviewed by: grehan Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D27095
|
#
9922872b |
|
11-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
bhyve: avoid allocating BARs above the end of supported physical addresses. Read CPUID leaf 0x80000008 to determine max supported phys address and create BAR region right below it, reserving 1/4 of the supported guest physical address space to the 64bit BARs mappings. PR: 250802 (although the issue from PR is not fixed by the change) Noted and reviewed by: grehan Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D27095
|
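The window calculation from the row above can be sketched as a pure function of the physical address width. On x86 that width is reported in EAX[7:0] of CPUID leaf 0x80000008; the sketch takes it as a parameter rather than executing CPUID so that it stays portable. `struct bar64_window` and `bar64_window_for()` are hypothetical names, and "right below the limit, one quarter of the space" is read literally from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the 64-bit BAR window. */
struct bar64_window {
	uint64_t base;
	uint64_t size;
};

/*
 * Reserve the top quarter of the physical address space for 64-bit
 * BARs. phys_bits is the supported physical address width in bits.
 */
static struct bar64_window
bar64_window_for(unsigned phys_bits)
{
	struct bar64_window w;
	uint64_t limit;

	assert(phys_bits >= 32 && phys_bits <= 52);
	limit = UINT64_C(1) << phys_bits;
	w.size = limit / 4;
	w.base = limit - w.size;
	return (w);
}
```

With a 39-bit physical address space, the window is 128 GB large and starts at 384 GB.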
#
21368498 |
|
25-May-2020 |
Peter Grehan <grehan@FreeBSD.org> |
Fix pci-passthru MSI issues with OpenBSD guests - Return 2 x 16-bit registers in the correct byte order for a 4-byte read that spans the CMD/STATUS register. This reversal was hiding the capabilities-list, which prevented the MSI capability from being found for XHCI passthru. - Reorganize MSI/MSI-x config writes so that a 4-byte write at the capability offset would have the read-only portion skipped. This prevented MSI interrupts from being enabled. Reported and extensively tested by Anatoli (me at anatoli dot ws) PR: 245392 Reported by: Anatoli (me at anatoli dot ws) Reviewed by: jhb (bhyve) Approved by: jhb, bz (mentor) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24951
|
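The byte-order issue in the row above comes down to how a 4-byte read at the command register offset is assembled. A minimal sketch, assuming a plain little-endian byte array as the config space model (bhyve's own accessors may differ); the register offsets and the capabilities-list status bit follow the standard PCI header layout, while the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SIZE		256
#define REG_COMMAND		0x04	/* 16-bit command register */
#define REG_STATUS		0x06	/* 16-bit status register */
#define STATUS_CAP_LIST		0x0010	/* capabilities list present */

/* Store a 16-bit register in little-endian order. */
static void
cfg_write16(uint8_t *cfg, unsigned off, uint16_t val)
{
	assert(off + 2 <= CFG_SIZE);
	cfg[off] = (uint8_t)(val & 0xff);
	cfg[off + 1] = (uint8_t)(val >> 8);
}

/* A 4-byte read returns the lower-addressed register in the low half. */
static uint32_t
cfg_read32(const uint8_t *cfg, unsigned off)
{
	assert(off + 4 <= CFG_SIZE);
	return ((uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	    ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24));
}
```

A guest that reads 4 bytes at `REG_COMMAND` expects the command register in bits 15:0 and the status register in bits 31:16. If the halves are swapped, the capabilities-list bit appears in the wrong position and the guest never walks the capability chain.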
#
483d953a |
|
04-May-2020 |
John Baldwin <jhb@FreeBSD.org> |
Initial support for bhyve save and restore. Save and restore (also known as suspend and resume) permits a snapshot to be taken of a guest's state that can later be resumed. In the current implementation, bhyve(8) creates a UNIX domain socket that is used by bhyvectl(8) to send a request to save a snapshot (and optionally exit after the snapshot has been taken). A snapshot currently consists of two files: the first holds a copy of guest RAM, and the second file holds other guest state such as vCPU register values and device model state. To resume a guest, bhyve(8) must be started with a matching pair of command line arguments to instantiate the same set of device models as well as a pointer to the saved snapshot. While the current implementation is useful for several use cases, it has a few limitations. The file format for saving the guest state is tied to the ABI of internal bhyve structures and is not self-describing (in that it does not communicate the set of device models present in the system). In addition, the state saved for some device models closely matches the internal data structures which might prove a challenge for compatibility of snapshot files across a range of bhyve versions. The file format also does not currently support versioning of individual chunks of state. As a result, the current file format is not a fixed binary format and future revisions to save and restore will break binary compatibility of snapshot files. The goal is to move to a more flexible format that adds versioning, etc. and at that point to commit to providing a reasonable level of compatibility. As a result, the current implementation is not enabled by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option for userland builds, and the kernel option BHYVE_SNAPSHOT. 
Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz Relnotes: yes Sponsored by: University Politehnica of Bucharest Sponsored by: Matthew Grooms (student scholarships) Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D19495
|
#
7840d1c4 |
|
27-Apr-2020 |
John Baldwin <jhb@FreeBSD.org> |
Update the cached MSI state when any MSI capability register is written. bhyve uses cached copies of the MSI capability registers to generate MSI interrupts for device models. Previously, these cached fields were only set when the MSI capability control register was updated. The Linux kernel recently adopted a change to deal with races in MSI interrupt delivery that writes to the MSI capability address and data registers to alter the destination of MSI interrupts without writing to the MSI capability control register. bhyve was not updating its cached registers for these writes and continued to send interrupts with the old data value to the old address. Fix this by recomputing the cached values for every write to any MSI capability register. Reported by: Jason Tubnor, Ryan Moeller Reported by: Marc Dionne (bisected the Linux kernel commit) Reviewed by: grehan MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24593
|
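The fix described in the row above amounts to refreshing the cached fields after every write that lands in the MSI capability, not only after control-register writes. A simplified sketch, assuming a 64-bit capable capability without mask bits and ignoring read-only fields; `struct msi_state`, `msi_recompute()` and `msi_cap_write()` are hypothetical names, not bhyve's:

```c
#include <assert.h>
#include <stdint.h>

#define MSI_CAP_LEN	14	/* id, next, ctrl, addr lo/hi, data */
#define MSI_OFF_CTRL	2
#define MSI_OFF_ADDR	4
#define MSI_OFF_DATA	12
#define MSI_CTRL_ENABLE	0x0001

struct msi_state {
	uint8_t regs[MSI_CAP_LEN];	/* raw capability bytes */
	int enabled;			/* cached from the control register */
	uint64_t addr;			/* cached message address */
	uint16_t data;			/* cached message data */
};

/* Assemble a little-endian value of 'bytes' bytes. */
static uint64_t
rd_le(const uint8_t *p, int bytes)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < bytes; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return (v);
}

/* Rebuild every cached field from the raw registers. */
static void
msi_recompute(struct msi_state *m)
{
	m->enabled = (rd_le(m->regs + MSI_OFF_CTRL, 2) & MSI_CTRL_ENABLE) != 0;
	m->addr = rd_le(m->regs + MSI_OFF_ADDR, 8);
	m->data = (uint16_t)rd_le(m->regs + MSI_OFF_DATA, 2);
}

/* Apply a guest write, then refresh the cache unconditionally. */
static void
msi_cap_write(struct msi_state *m, int off, int bytes, uint32_t val)
{
	int i;

	assert(off >= 0 && bytes > 0 && bytes <= 4 &&
	    off + bytes <= MSI_CAP_LEN);
	for (i = 0; i < bytes; i++)
		m->regs[off + i] = (uint8_t)(val >> (8 * i));
	msi_recompute(m);
}
```

A guest that rewrites only the data register, as the Linux change mentioned above does, then sees its next interrupt delivered with the new value without having to touch the control register.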
#
332eff95 |
|
08-Jan-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
bhyve: add wrapper for debug printf statements Add printf() wrapper to use CR/CRLF terminators depending on whether stdio is mapped to a tty open in raw mode. Try to use the wrapper everywhere. For now we leave the custom DPRINTF/WPRINTF defined by device models, but we may remove them in the future. Reviewed by: grehan, jhb MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22657
|
#
412d13d5 |
|
24-Oct-2019 |
Jung-uk Kim <jkim@FreeBSD.org> |
Catch up with ACPICA 20191018. PR: 241467 XMFC with: r353764
|
#
0026d8cc |
|
12-Jun-2019 |
John Baldwin <jhb@FreeBSD.org> |
Remove a spurious break when setting up a 64-bit memory BAR. This was causing 'enbit' to not be initialized in this case. CID: 1401924 Reported by: Coverity MFC after: 1 week
|
#
129f93c5 |
|
07-Jun-2019 |
Chuck Tuffli <chuck@FreeBSD.org> |
bhyve: Add PCIe Integrated Endpoint capability The NVMe CAM driver reports the PCIe Link Capability and Status for devices. For emulated bhyve NVMe devices, this looks like: nda0: nvme version 1.3 x63 (max x63) lanes PCIe Gen15 (max Gen15) link The driver outputs this because the emulated device doesn't include the PCIe Capability structure. The NVMe specification requires these registers, so the fix is to add this set of capability registers to the emulated device. Note that PCI Express devices that are integrated into the Root Complex (i.e. Bus 0x0) do not have to support the Link Capability or Status registers. Windows will fail to start (i.e. Code 10) devices that appear to be part of the Root Complex but report being a PCI Express Endpoint. So also add a check to pci_emul_add_pciecap() to check if the device is integrated and change the device type. Reviewed by: imp, ken, araujo, jhb, rgrimes Approved by: imp (mentor), ken (mentor), jhb (maintainer) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D19904
|
#
56282675 |
|
07-Jun-2019 |
John Baldwin <jhb@FreeBSD.org> |
Keep the shadow PCIR_COMMAND synced with the real one for pass through. This ensures that bhyve properly recognizes when decoding is disabled for BARs on passthru devices. To properly handle writes to the register, export a pci_emul_cmd_changed function from pci_emul.c that the pass through device model invokes for config writes that change PCIR_COMMAND. Reviewed by: rgrimes MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D20531
|
#
2729c9bb |
|
07-Jun-2019 |
John Baldwin <jhb@FreeBSD.org> |
Enable memory and I/O decoding in PCI devices on demand. Rather than unconditionally setting the MEMEN and PORTEN bits in PCIR_COMMAND for PCI devices, set the respective bit when the first BAR of a given type is added to the device. This more closely matches what firmware does on bare metal. BUSMASTEREN is still set unconditionally. Eventually this bit should move into the device models as not all device models need this set. Reviewed by: rgrimes MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D20530
|
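The on-demand enabling in the row above can be sketched as a small function applied whenever a BAR is added. The bit values follow the standard PCI command register layout; `enum bar_type` and `command_after_bar_added()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CMD_PORTEN		0x0001	/* I/O space decoding */
#define CMD_MEMEN		0x0002	/* memory space decoding */
#define CMD_BUSMASTEREN		0x0004	/* bus mastering */

enum bar_type { BAR_IO, BAR_MEM32, BAR_MEM64 };

/* Return the command register value after adding a BAR of 'type'. */
static uint16_t
command_after_bar_added(uint16_t command, enum bar_type type)
{
	switch (type) {
	case BAR_IO:
		command |= CMD_PORTEN;
		break;
	case BAR_MEM32:
	case BAR_MEM64:
		command |= CMD_MEMEN;
		break;
	default:
		assert(0 && "unknown BAR type");
	}
	return (command);
}
```

A device that only ever registers memory BARs therefore never gets I/O decoding enabled, which matches the bare-metal behaviour the commit refers to.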
#
df61066e |
|
31-May-2019 |
John Baldwin <jhb@FreeBSD.org> |
Whitespace cleanups, no functional change.
|
#
f0dfbccc |
|
13-Apr-2019 |
Chuck Tuffli <chuck@FreeBSD.org> |
Revert r345171 pending review Backing out commit pending further discussion on the PCIe version supported by pseudo (i.e. emulated) devices. See Differential for details. Reviewed by: imp Approved by: imp (mentor) Differential Revision: https://reviews.freebsd.org/D19580
|
#
2ba64075 |
|
14-Mar-2019 |
Chuck Tuffli <chuck@FreeBSD.org> |
Fix bhyve PCIe capability emulation PCIe devices starting with version 1.1 must set the Role-Based Error Reporting bit. And while we're in the neighborhood, generalize the code assigning the device type. Reviewed by: imp, araujo, rgrimes Approved by: imp (mentor) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D19580
|
#
657d2158 |
|
22-Aug-2018 |
Marcelo Araujo <araujo@FreeBSD.org> |
Add -s "help" and -l "help" to print a list of supported PCI and LPC devices. For tools that use bhyve, such as libvirt, it is important to be able to probe what features are supported by the given bhyve binary. To give more context, libvirt probes bhyve's capabilities in a not very effective way: - Running 'bhyve -h' and parsing output. - To detect devices, it runs 'bhyve -s 0,dev' for each device and parses error output to identify if the device is supported or not. PR: 2101111 Submitted by: novel MFC after: 2 weeks Relnotes: yes Sponsored by: iXsystems Inc.
|
#
f7224b70 |
|
13-Jun-2018 |
Marcelo Araujo <araujo@FreeBSD.org> |
Fix style(9) space vs tab. Reviewed by: jhb MFC after: 3 weeks. Sponsored by: iXsystems Inc. Differential Revision: https://reviews.freebsd.org/D15768
|
#
92046bf1 |
|
22-May-2018 |
Marcelo Araujo <araujo@FreeBSD.org> |
Revert: r334016 Revert this change for now; it somehow breaks init_pci.
|
#
b5e3928d |
|
21-May-2018 |
Marcelo Araujo <araujo@FreeBSD.org> |
We must free the variable str. Spotted by: clang's static analyzer Submitted by: Tom Rix <trix_juniper.net> Reviewed by: grehan MFC after: 4 weeks Sponsored by: iXsystems Inc. Differential Revision: https://reviews.freebsd.org/D10009
|
#
1de7b4b8 |
|
27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
various: general adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. No functional change intended.
|
#
1b4496d0 |
|
14-Jul-2016 |
Alexander Motin <mav@FreeBSD.org> |
Make PCI interrupt allocation static when using bootrom (UEFI). This makes the actual interrupt routing match the one shipped with UEFI firmware. With old firmware this makes legacy interrupts work reliably for function 0 of PCI slots 3-6. An updated UEFI image fixes the problem completely.
|
#
7e12dfe5 |
|
06-Jul-2016 |
Enji Cooper <ngie@FreeBSD.org> |
Fix CTASSERT issue in a more clean way - Replace all CTASSERT macro instances with static_assert's. - Remove the WRAPPED_CTASSERT macro; it's now an unnecessary obfuscation. - Localize all static_assert's to the structures being tested. - Sort some headers per-style(9). Approved by: re (hrs) Differential Revision: https://reviews.freebsd.org/D7130 MFC after: 1 week X-MFC with: r302364 Reviewed by: ed, grehan (maintainer) Submitted by: ed Sponsored by: EMC / Isilon Storage Division
|
#
edb60334 |
|
05-Jul-2016 |
Enji Cooper <ngie@FreeBSD.org> |
Fix gcc warnings Add `WRAPPED_CTASSERT` macro by annotating CTASSERTs with __unused to deal with -Wunused-local-typedefs warnings from gcc 4.8+. All other compilers (clang, etc) use CTASSERT as-is. A more generic solution for this issue will be proposed after ^/stable/11 is forked. Consolidate all CTASSERTs under one block instead of inlining them in functions. Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D7119 MFC after: 1 week Reported by: Jenkins Reviewed by: grehan (maintainer) Sponsored by: EMC / Isilon Storage Division
|
#
9f3dba68 |
|
13-May-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
bhyve: consider the bogus case of a negative bar idx. This is a followup to r297472 to squelch Coverity. CID: 1194319
|
#
6e43f3ed |
|
31-Mar-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
pci_emul_dior(): fix uninitialized scalar variable. Prevent returning an uninitialized value in case the ior size is unknown. CID: 1194319 Reviewed by: grehan
|
#
d74fdc6a |
|
30-Dec-2015 |
Marcelo Araujo <araujo@FreeBSD.org> |
Clean up unused-but-set-variable spotted by gcc-4.9. Reviewed by: grehan Approved by: bapt (mentor) Differential Revision: https://reviews.freebsd.org/D4735
|
#
463a577b |
|
20-Oct-2015 |
Eitan Adler <eadler@FreeBSD.org> |
Fix a ton of speelling errors arc lint is helpful Reviewed By: allanjude, wblock, #manpages, chris@bsdjunk.com Differential Revision: https://reviews.freebsd.org/D3337
|
#
fd4e0d4c |
|
01-May-2015 |
Neel Natu <neel@FreeBSD.org> |
Advertise an additional memory BAR in the "dummy" device emulation. This is useful for testing the MOVS emulation when both the source and destination addresses are in the MMIO space. MFC after: 1 week
|
#
54335630 |
|
24-Apr-2015 |
Neel Natu <neel@FreeBSD.org> |
Don't allow guest to modify readonly bits in the PCI config 'status' register. Reported by: Leon Dang (ldang@nahannisys.com) MFC after: 2 weeks
|
#
12a6eb99 |
|
07-Aug-2014 |
Neel Natu <neel@FreeBSD.org> |
Support PCI extended config space in bhyve. Add the ACPI MCFG table to advertise the extended config memory window. Introduce a new flag MEM_F_IMMUTABLE for memory ranges that cannot be deleted or moved in the guest's address space. The PCI extended config space is an example of an immutable memory range. Add emulation for the "movzw" instruction. This instruction is used by FreeBSD to read a 16-bit extended config space register. CR: https://phabric.freebsd.org/D505 Reviewed by: jhb, grehan Requested by: tychon
|
#
be679db4 |
|
23-Jun-2014 |
Neel Natu <neel@FreeBSD.org> |
Provide APIs to get the 'lowmem' and 'highmem' sizes directly. Previously the sizes were inferred indirectly based on the size of the mappings at 0 and 4GB respectively. This works fine as long as the size of the allocation is identical to the size of the mapping in the guest's address space. However, if the mapping is disjoint then this assumption falls apart (e.g., due to the legacy BIOS hole between 640KB and 1MB).
|
#
67b6ffaa |
|
09-Jun-2014 |
Tycho Nightingale <tychon@FreeBSD.org> |
r267169 should apply to 64-bit BARs as well. Reviewed by: neel
|
#
b6ae8b05 |
|
06-Jun-2014 |
Tycho Nightingale <tychon@FreeBSD.org> |
Some devices (e.g. Intel AHCI and NICs) support quad-word access to register pairs where two 32-bit registers make up a larger logical size. Support those accesses by splitting the quad-word into two double-words. Reviewed by: grehan
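The splitting can be pictured as two 32-bit handler calls whose results are combined, low double-word first. This is a minimal sketch under that assumption; the callback type and the toy register file are hypothetical and only serve to make the example self-contained.

```c
#include <stdint.h>

/* Hypothetical 32-bit register read callback of a device model. */
typedef uint32_t (*read32_fn)(void *dev, uint64_t offset);

/*
 * Service a 64-bit read by issuing two 32-bit reads. The low
 * double-word lives at the lower offset (little-endian layout).
 */
static uint64_t
read64_split(read32_fn rd, void *dev, uint64_t offset)
{
	uint64_t lo = rd(dev, offset);
	uint64_t hi = rd(dev, offset + 4);

	return (lo | (hi << 32));
}

/* Toy register file: "dev" is just an array of 32-bit registers. */
static uint32_t
demo_read32(void *dev, uint64_t offset)
{
	const uint32_t *regs = dev;

	return (regs[offset / 4]);
}
```

A write would be split the same way, with the low half written to `offset` and the high half to `offset + 4`.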
|
#
b3e9732a |
|
15-May-2014 |
John Baldwin <jhb@FreeBSD.org> |
Implement a PCI interrupt router to route PCI legacy INTx interrupts to the legacy 8259A PICs. - Implement an ICH-compatible PCI interrupt router on the lpc device with 8 steerable pins configured via config space access to byte-wide registers at 0x60-63 and 0x68-6b. - For each configured PCI INTx interrupt, route it to both an I/O APIC pin and a PCI interrupt router pin. When a PCI INTx interrupt is asserted, ensure that both pins are asserted. - Provide an initial routing of PCI interrupt router (PIRQ) pins to 8259A pins (ISA IRQs) and initialize the interrupt line config register for the corresponding PCI function with the ISA IRQ as this matches existing hardware. - Add a global _PIC method for OSPM to select the desired interrupt routing configuration. - Update the _PRT methods for PCI bridges to provide both APIC and legacy PRT tables and return the appropriate table based on the configured routing configuration. Note that if the lpc device is not configured, no routing information is provided. - When the lpc device is enabled, provide ACPI PCI link devices corresponding to each PIRQ pin. - Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A pins via the ELCR. - Mark the power management SCI as level triggered. - Don't hardcode the number of elements in Packages in the source for the DSDT. iasl(8) will fill in the actual number of elements, and this makes it simpler to generate a Package with a variable number of elements. Reviewed by: tycho
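The eight steerable pins live in two separate runs of byte-wide config registers, 0x60-0x63 and 0x68-0x6b. A small sketch of that mapping follows; the function name and the zero-based pin numbering are assumptions made for illustration and are not taken from the bhyve source.

```c
/*
 * Map a zero-based PIRQ pin index (0..7) to the config space offset of
 * its routing register: pins 0-3 at 0x60-0x63, pins 4-7 at 0x68-0x6b.
 * Returns -1 for an out-of-range pin.
 */
static int
pirq_config_offset(int pin)
{
	if (pin < 0 || pin > 7)
		return (-1);
	return (pin < 4 ? 0x60 + pin : 0x68 + (pin - 4));
}
```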
|
#
b100acf2 |
|
01-May-2014 |
Neel Natu <neel@FreeBSD.org> |
Don't allow MPtable generation if there are multiple PCI hierarchies. This is because there isn't a standard way to relay this information to the guest OS. Add a command line option "-Y" to bhyve(8) to inhibit MPtable generation. If the virtual machine is using PCI devices on buses other than 0 then it can still use ACPI tables to convey this information to the guest. Discussed with: grehan@
|
#
fcbec691 |
|
25-Apr-2014 |
Peter Grehan <grehan@FreeBSD.org> |
Respect and track the enable bit in the PCI configuration address word. Ignore writes, and return 0xff's, on config accesses when not set. Behaviour now matches that seen on h/w. Found with a NetBSD/amd64 guest. Reviewed by: tychon MFC after: 3 weeks
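For context, the configuration address word written to port 0xcf8 packs an enable bit together with the bus, slot, function and register number. The sketch below decodes that word and returns all ones when the enable bit is clear, as the commit describes. The field layout is the standard PCI configuration mechanism #1; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded form of the 32-bit configuration address word. */
struct cfg_addr {
	bool     enabled;  /* bit 31 */
	unsigned bus;      /* bits 23:16 */
	unsigned slot;     /* bits 15:11 */
	unsigned func;     /* bits 10:8 */
	unsigned reg;      /* bits 7:2, dword-aligned register offset */
};

static struct cfg_addr
cfg_decode(uint32_t v)
{
	struct cfg_addr a;

	a.enabled = (v & 0x80000000u) != 0;
	a.bus = (v >> 16) & 0xff;
	a.slot = (v >> 11) & 0x1f;
	a.func = (v >> 8) & 0x7;
	a.reg = v & 0xfc;
	return (a);
}

/* A data-port read returns all ones unless the enable bit is set. */
static uint32_t
cfg_read(uint32_t addr_word, uint32_t real_value)
{
	return (cfg_decode(addr_word).enabled ? real_value : 0xffffffffu);
}
```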
|
#
994f858a |
|
22-Apr-2014 |
Xin LI <delphij@FreeBSD.org> |
Use calloc() in favor of malloc + memset. Reviewed by: neel
|
#
7a902ec0 |
|
18-Feb-2014 |
Neel Natu <neel@FreeBSD.org> |
Add a check to validate that memory BARs of passthru devices are 4KB aligned. Also, the MSI-x table offset is not required to be 4KB aligned so take this into account when computing the pages occupied by the MSI-x tables.
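One way to account for an unaligned table offset is to round the start of the table down and its end up to a page boundary. The following is a minimal sketch of that arithmetic, assuming 4KB pages; the names are hypothetical and this is not the passthru code itself.

```c
#include <stdint.h>

#define PAGE_SZ 4096u

/* Page-aligned span, relative to the start of the BAR. */
struct span {
	uint64_t start;
	uint64_t len;
};

/* Pages touched by a table of table_size bytes at table_offset. */
static struct span
msix_table_pages(uint64_t table_offset, uint64_t table_size)
{
	const uint64_t mask = ~(uint64_t)(PAGE_SZ - 1);
	uint64_t end = table_offset + table_size;
	struct span s;

	s.start = table_offset & mask;        /* round start down */
	end = (end + PAGE_SZ - 1) & mask;     /* round end up */
	s.len = end - s.start;
	return (s);
}

/* The alignment check applied to a memory BAR base address. */
static int
bar_is_page_aligned(uint64_t base)
{
	return ((base & (PAGE_SZ - 1)) == 0);
}
```

For example, a 256-byte table (16 entries of 16 bytes) at offset 0xf80 straddles a page boundary and therefore occupies two pages.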
|
#
a96b8b80 |
|
17-Feb-2014 |
John Baldwin <jhb@FreeBSD.org> |
Tweak the handling of PCI capabilities in emulated devices to remove the non-standard zero capability list terminator. Instead, track the start and end of the most recently added capability and use that to adjust the previous capability's next pointer when a capability is added and to determine the range of config registers belonging to PCI capability registers. Reviewed by: neel
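The bookkeeping described here can be sketched as follows: remember where the last capability starts and ends, and when a new one is appended, patch the previous capability's next pointer (or the capability pointer in the header for the first one). The starting offset of 0x40, the 4-byte alignment, and all names are assumptions for illustration; only the capability pointer offset (0x34) and the id/next byte layout are standard PCI.

```c
#include <stdint.h>

#define CFG_CAPPTR  0x34  /* capability list head pointer in the header */
#define CAP_START   0x40  /* assumed first offset used for capabilities */

struct capdev {
	uint8_t cfg[256];  /* emulated config space */
	int     prev_cap;  /* offset of the most recently added capability */
	int     cap_end;   /* last byte used by that capability */
};

/* Append a capability of len bytes; returns its offset or -1. */
static int
cap_add(struct capdev *d, uint8_t id, int len)
{
	int off;

	off = (d->cap_end == 0) ? CAP_START : (d->cap_end + 1 + 3) & ~3;
	if (len < 2 || off + len > 256)
		return (-1);

	if (d->prev_cap == 0)
		d->cfg[CFG_CAPPTR] = (uint8_t)off;       /* first capability */
	else
		d->cfg[d->prev_cap + 1] = (uint8_t)off;  /* link previous -> new */

	d->cfg[off] = id;
	d->cfg[off + 1] = 0;  /* new capability terminates the list */
	d->prev_cap = off;
	d->cap_end = off + len - 1;
	return (off);
}
```

The range `CAP_START..cap_end` then identifies which config registers belong to capabilities, without needing a dummy terminating entry.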
|
#
d84882ca |
|
14-Feb-2014 |
Neel Natu <neel@FreeBSD.org> |
Allow PCI devices to be configured on all valid bus numbers from 0 to 255. This is done by representing each bus as a root PCI device in ACPI. The device implements the _BBN method to return the PCI bus number to the guest OS. Each PCI bus keeps track of the resources that it decodes for devices configured on the bus: i/o, mmio (32-bit) and mmio (64-bit). These windows are advertised to the guest via the _CRS object of the root device. Bus 0 is treated specially since it consumes the I/O ports to access the PCI config space [0xcf8-0xcff]. It also decodes the legacy I/O ports that are consumed by devices on the LPC bus. For this reason the LPC bridge can be configured only on bus 0. The bus number can be specified using the following command line option to bhyve(8): "-s <bus>:<slot>:<func>,<emul>[,<config>]" Discussed with: grehan@ Reviewed by: jhb@
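A parser for the address part of that option could look like the sketch below. It accepts `bus:slot:func`, and also assumes that the shorter `slot:func` and `slot` forms from the earlier syntax remain valid with the bus defaulting to 0. The function name and the exact fallback order are hypothetical; this is not a copy of bhyve's own parsing routine.

```c
#include <stdio.h>

/*
 * Parse "bus:slot:func", "slot:func" or "slot" (the text before the
 * first comma of a -s argument). Returns 0 on success, -1 on error.
 */
static int
parse_bsf(const char *s, int *bus, int *slot, int *func)
{
	int b = 0, sl = 0, f = 0;
	char extra;

	if (sscanf(s, "%d:%d:%d%c", &b, &sl, &f, &extra) == 3) {
		/* full bus:slot:func form */
	} else if (sscanf(s, "%d:%d%c", &sl, &f, &extra) == 2) {
		b = 0;
	} else if (sscanf(s, "%d%c", &sl, &extra) == 1) {
		b = 0;
		f = 0;
	} else {
		return (-1);
	}

	/* Valid ranges: 256 buses, 32 slots per bus, 8 functions per slot. */
	if (b < 0 || b > 255 || sl < 0 || sl > 31 || f < 0 || f > 7)
		return (-1);

	*bus = b;
	*slot = sl;
	*func = f;
	return (0);
}
```

The trailing `%c` conversion is only there to detect leftover characters, so an input such as `1:2:3x` is rejected rather than silently truncated.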
|
#
3cbf3585 |
|
29-Jan-2014 |
John Baldwin <jhb@FreeBSD.org> |
Enhance the support for PCI legacy INTx interrupts and enable them in the virtio backends. - Add a new ioctl to export the count of pins on the I/O APIC from vmm to the hypervisor. - Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for ISA interrupts. - Populate the MP Table with I/O interrupt entries for any PCI INTx interrupts. - Create a _PRT table under the PCI root bridge in ACPI to route any PCI INTx interrupts appropriately. - Track which INTx interrupts are in use per-slot so that functions that share a slot attempt to distribute their INTx interrupts across the four available pins. - Implicitly mask INTx interrupts if either MSI or MSI-X is enabled and when the INTx DIS bit is set in a function's PCI command register. Either assert or deassert the associated I/O APIC pin when the state of one of those conditions changes. - Add INTx support to the virtio backends. - Always advertise the MSI capability in the virtio backends. Submitted by: neel (7) Reviewed by: neel MFC after: 2 weeks
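The per-slot tracking can be illustrated with a simple least-used policy: each slot keeps a use count for its four INTx pins, and a function asking for an interrupt gets the pin with the fewest users. This is only a sketch of one plausible policy with hypothetical names; the commit message does not spell out the exact selection rule.

```c
#define INTX_PINS 4

/* Per-slot count of functions using each INTx pin. */
struct slot_intx {
	int use_count[INTX_PINS];
};

/*
 * Pick the least-used pin in the slot and record the new user.
 * Returns the value for the PCI interrupt pin register:
 * 1 = INTA#, 2 = INTB#, 3 = INTC#, 4 = INTD#.
 */
static int
slot_pick_pin(struct slot_intx *s)
{
	int best = 0;

	for (int i = 1; i < INTX_PINS; i++) {
		if (s->use_count[i] < s->use_count[best])
			best = i;
	}
	s->use_count[best]++;
	return (best + 1);
}
```

With this rule, the first four functions in a slot each get their own pin, and sharing only begins with the fifth.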
|
#
d2bc4816 |
|
27-Jan-2014 |
John Baldwin <jhb@FreeBSD.org> |
Remove support for legacy PCI devices. These haven't been needed since support for LPC uart devices was added and it conflicts with upcoming patches to add PCI INTx support. Reviewed by: neel
|
#
e6c8bc29 |
|
02-Jan-2014 |
John Baldwin <jhb@FreeBSD.org> |
Rework the DSDT generation code a bit to generate more accurate info about LPC devices. Among other things, the LPC serial ports now appear as ACPI devices. - Move the info for the top-level PCI bus into the PCI emulation code and add ResourceProducer entries for the memory ranges decoded by the bus for memory BARs. - Add a framework to allow each PCI emulation driver to optionally write an entry into the DSDT under the \_SB_.PCI0 namespace. The LPC driver uses this to write a node for the LPC bus (\_SB_.PCI0.ISA). - Add a linker set to allow any LPC devices to write entries into the DSDT below the LPC node. - Move the existing DSDT block for the RTC to the RTC driver. - Add DSDT nodes for the AT PIC, the 8254 ISA timer, and the LPC UART devices. - Add a "SuperIO" device under the LPC node to claim "system resources" along with a linker set to allow various drivers to add IO or memory ranges that should be claimed as a system resource. - Add system resource entries for the extended RTC IO range, the registers used for ACPI power management, the ELCR, PCI interrupt routing register, and post data register. - Add various helper routines for generating DSDT entries. Reviewed by: neel (earlier version)
|
#
4f8be175 |
|
16-Dec-2013 |
Neel Natu <neel@FreeBSD.org> |
Add an API to deliver message signalled interrupts to vcpus. This allows callers to treat the MSI 'addr' and 'data' fields as opaque and also lets bhyve implement multiple destination modes: physical, flat and clustered. Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com) Reviewed by: grehan@
|
#
ac7304a7 |
|
22-Nov-2013 |
Neel Natu <neel@FreeBSD.org> |
Add an ioctl to assert and deassert an ioapic pin atomically. This will be used to inject edge triggered legacy interrupts into the guest. Start using the new API in device models that use edge triggered interrupts: viz. the 8254 timer and the LPC/uart device emulation. Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
|
#
565bbb86 |
|
12-Nov-2013 |
Neel Natu <neel@FreeBSD.org> |
Move the ioapic device model from userspace into vmm.ko. This is needed for upcoming in-kernel device emulations like the HPET. The ioctls VM_IOAPIC_ASSERT_IRQ and VM_IOAPIC_DEASSERT_IRQ are used to manipulate the ioapic pin state. Discussed with: grehan@ Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
|
#
c8afb9bc |
|
06-Nov-2013 |
Neel Natu <neel@FreeBSD.org> |
Fix an off-by-one error when iterating over the emulated PCI BARs. Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
|
#
ea7f1c8c |
|
28-Oct-2013 |
Neel Natu <neel@FreeBSD.org> |
Add support for PCI-to-ISA LPC bridge emulation. If the LPC bus is attached to a virtual machine then we implicitly create COM1 and COM2 ISA devices. Prior to this change the only way of attaching a COM port to the virtual machine was by presenting it as a PCI device that is mapped at the legacy I/O address 0x3F8 or 0x2F8. There were some issues with the original approach: - It did not work at all with UEFI because UEFI will reprogram the PCI device BARs and remap the COM1/COM2 ports at non-legacy addresses. - OpenBSD GENERIC kernel does not create a /dev/console because it expects the uart device at the legacy 0x3F8/0x2F8 address to be an ISA device. - It was functional with a FreeBSD guest but caused the console to appear on /dev/ttyu2 which was not intuitive. The uart emulation is now independent of the bus on which it resides. Thus it is possible to have uart devices on the PCI bus in addition to the legacy COM1/COM2 devices behind the LPC bus. The command line option to attach ISA COM1/COM2 ports to a virtual machine is "-s <bus>,lpc -l com1,stdio". The command line option to create a PCI-attached uart device is: "-s <bus>,uart[,stdio]" The command line option to create PCI-attached COM1/COM2 device is: "-S <bus>,uart[,stdio]". This style of creating COM ports is deprecated. Discussed with: grehan Reviewed by: grehan Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com) M share/examples/bhyve/vmrun.sh AM usr.sbin/bhyve/legacy_irq.c AM usr.sbin/bhyve/legacy_irq.h M usr.sbin/bhyve/Makefile AM usr.sbin/bhyve/uart_emul.c M usr.sbin/bhyve/bhyverun.c AM usr.sbin/bhyve/uart_emul.h M usr.sbin/bhyve/pci_uart.c M usr.sbin/bhyve/pci_emul.c M usr.sbin/bhyve/inout.c M usr.sbin/bhyve/pci_emul.h M usr.sbin/bhyve/inout.h AM usr.sbin/bhyve/pci_lpc.c AM usr.sbin/bhyve/pci_lpc.h
|
#
2a8d400a |
|
09-Oct-2013 |
Peter Grehan <grehan@FreeBSD.org> |
Allow a 4-byte write to PCI config space to overlap the 2 read-only bytes at the start of a PCI capability. This is the sequence that OpenBSD uses when enabling MSI interrupts, and works fine on real h/w. In bhyve, convert the 4 byte write to a 2-byte write to the r/w area past the first 2 r/o bytes of a capability. Reviewed by: neel Approved by: re@ (blanket)
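The conversion amounts to dropping the low 16 bits (the read-only capability ID and next pointer) and keeping the high 16 bits as a 2-byte write at offset + 2. A minimal sketch, with a hypothetical struct and function name:

```c
#include <stdint.h>

/* A pending config space write. */
struct cfg_write {
	int      offset;
	int      bytes;
	uint32_t val;
};

/*
 * If a 4-byte write starts exactly at a capability header, skip the
 * two read-only bytes and write only the upper half.
 */
static struct cfg_write
cap_adjust_write(int capoff, struct cfg_write w)
{
	if (w.offset == capoff && w.bytes == 4) {
		w.offset += 2;
		w.bytes = 2;
		w.val >>= 16;
	}
	return (w);
}
```

For an MSI capability at 0x40, a 4-byte write of 0x00010005 becomes a 2-byte write of 0x0001 to the message control register at 0x42, which sets the MSI enable bit.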
|
#
318224bb |
|
05-Oct-2013 |
Neel Natu <neel@FreeBSD.org> |
Merge projects/bhyve_npt_pmap into head. Make the amd64/pmap code aware of nested page table mappings used by bhyve guests. This allows bhyve to associate each guest with its own vmspace and deal with nested page faults in the context of that vmspace. This also enables features like accessed/dirty bit tracking, swapping to disk and transparent superpage promotions of guest memory. Guest vmspace: Each bhyve guest has a unique vmspace to represent the physical memory allocated to the guest. Each memory segment allocated by the guest is mapped into the guest's address space via the 'vmspace->vm_map' and is backed by an object of type OBJT_DEFAULT. pmap types: The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT. The PT_X86 pmap type is used by the vmspace associated with the host kernel as well as user processes executing on the host. The PT_EPT pmap is used by the vmspace associated with a bhyve guest. Page Table Entries: The EPT page table entries are mostly similar in functionality to regular page table entries although there are some differences in terms of what bits are used to express that functionality. For example, the dirty bit is represented by bit 9 in the nested PTE as opposed to bit 6 in the regular x86 PTE. Therefore the bitmask representing the dirty bit is now computed at runtime based on the type of the pmap. Thus PG_M that was previously a macro now becomes a local variable that is initialized at runtime using 'pmap_modified_bit(pmap)'. An additional wrinkle associated with EPT mappings is that older Intel processors don't have hardware support for tracking accessed/dirty bits in the PTE. This means that the amd64/pmap code needs to emulate these bits to provide proper accounting to the VM subsystem. 
This is achieved by using the following mapping for EPT entries that need emulation of A/D bits:

Bit    Position  Interpreted By
PG_V   52        software (accessed bit emulation handler)
PG_RW  53        software (dirty bit emulation handler)
PG_A   0         hardware (aka EPT_PG_RD)
PG_M   1         hardware (aka EPT_PG_WR)

The idea to use the mapping listed above for A/D bit emulation came from Alan Cox (alc@). The final difference with respect to x86 PTEs is that some EPT implementations do not support superpage mappings. This is recorded in the 'pm_flags' field of the pmap. TLB invalidation: The amd64/pmap code has a number of ways to do invalidation of mappings that may be cached in the TLB: single page, multiple pages in a range or the entire TLB. All of these funnel into a single EPT invalidation routine called 'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and sends an IPI to the host cpus that are executing the guest's vcpus. On a subsequent entry into the guest it will detect that the EPT has changed and invalidate the mappings from the TLB. Guest memory access: Since the guest memory is no longer wired we need to hold the host physical page that backs the guest physical page before we can access it. The helper functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose. PCI passthru: Guests with PCI passthru devices will wire the entire guest physical address space. The MMIO BAR associated with the passthru device is backed by a vm_object of type OBJT_SG. An IOMMU domain is created only for guests that have one or more PCI passthru devices attached to them. Limitations: There isn't a way to map a guest physical page without execute permissions. This is because the amd64/pmap code interprets the guest physical mappings as user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U shares the same bit position as EPT_PG_EXECUTE all guest mappings become automatically executable. 
Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews as well as their support and encouragement. Thanks to John Baldwin for reviewing the use of OBJT_SG as the backing object for pci passthru mmio regions. Special thanks to Peter Holm for testing the patch on short notice. Approved by: re Discussed with: grehan Reviewed by: alc, kib Tested by: pho
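The runtime selection of the dirty-bit mask described in this commit can be sketched in a few lines. The bit positions (6 for the regular x86 PTE, 9 for the nested PTE) come from the commit message; the function below takes the pmap type directly instead of a pmap pointer, and its name and the macro names are simplifications for illustration rather than the kernel's definitions.

```c
#include <stdint.h>

enum pmap_type { PT_X86, PT_EPT };

#define X86_DIRTY_BIT  (UINT64_C(1) << 6)  /* dirty bit in a regular x86 PTE */
#define EPT_DIRTY_BIT  (UINT64_C(1) << 9)  /* dirty bit in a nested (EPT) PTE */

/* Return the PTE mask that means "modified" for this kind of pmap. */
static uint64_t
modified_bit(enum pmap_type type)
{
	return (type == PT_EPT ? EPT_DIRTY_BIT : X86_DIRTY_BIT);
}

/* Check a PTE using a mask chosen at runtime, not a fixed macro. */
static int
pte_is_dirty(enum pmap_type type, uint64_t pte)
{
	uint64_t pg_m = modified_bit(type);

	return ((pte & pg_m) != 0);
}
```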
|
#
6a52209f |
|
27-Aug-2013 |
Neel Natu <neel@FreeBSD.org> |
Allow single byte reads of the emulated MSI-X tables. This is not required by the PCI specification but needed to dump MMIO space from "ddb" in the guest.
|
#
50dc0db3 |
|
15-Aug-2013 |
Peter Grehan <grehan@FreeBSD.org> |
Fix ordering of legacy IRQ reservations. Submitted by: Jeremiah Lott jlott at averesystems dot com
|
#
a38e2a64 |
|
03-Jul-2013 |
Peter Grehan <grehan@FreeBSD.org> |
Support an optional "mac=" parameter to virtio-net config, to allow users to set the MAC address for a device. Clean up some obsolete code in pci_virtio_net.c Allow an error return from a PCI device emulation's init routine to be propagated all the way back to the top-level and result in the process exiting. Submitted by: Dinakar Medavaram dinnu sun at gmail (original version)
|
#
34d244ed |
|
01-Jul-2013 |
Peter Grehan <grehan@FreeBSD.org> |
Fix up option parsing to allow a colon in the config section. Clean up some other unnecessary code. Submitted by: Dinakar Medavaram dinnu sun at gmail Reviewed by: neel
|
#
75543036 |
|
27-Jun-2013 |
Peter Grehan <grehan@FreeBSD.org> |
Allow the PCI config address register to be read. The Linux kernel does this. Also remove an unused header file. Submitted by: tycho nightingale at pluribusnetworks com Reviewed by: neel
|
#
b05c77ff |
|
25-Apr-2013 |
Neel Natu <neel@FreeBSD.org> |
Gripe if some <slot,function> tuple is specified more than once instead of silently overwriting the previous assignment. Gripe if the emulation is not recognized instead of silently ignoring the emulated device. If an error is detected by pci_parse_slot() then exit from the command line parsing loop in main(). Submitted by (initial version): Chris Torek (chris.torek@gmail.com)
|
#
9f08548d |
|
16-Apr-2013 |
Neel Natu <neel@FreeBSD.org> |
Set up accesses to the memory hole below 4GB to return all 1's on read and consume all writes without any side effects. Obtained from: NetApp
|
#
028d9311 |
|
09-Apr-2013 |
Neel Natu <neel@FreeBSD.org> |
Improve PCI BAR emulation: - Respect the MEMEN and PORTEN bits in the command register - Allow the guest to reprogram the address decoded by the BAR Submitted by: Gopakumar T Obtained from: NetApp
|
#
b060ba50 |
|
18-Mar-2013 |
Neel Natu <neel@FreeBSD.org> |
Simplify the assignment of memory to virtual machines by requiring a single command line option "-m <memsize in MB>" to specify the memory size. Prior to this change the user needed to explicitly specify the amount of memory allocated below 4G (-m <lowmem>) and the amount above 4G (-M <highmem>). The "-M" option is no longer supported by 'bhyveload' and 'bhyve'. The start of the PCI hole is fixed at 3GB and cannot be directly changed using command line options. However it is still possible to change this in special circumstances via the 'vm_set_lowmem_limit()' API provided by libvmmapi. Submitted by: Dinakar Medavaram (initial version) Reviewed by: grehan Obtained from: NetApp
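With the PCI hole starting at a fixed 3GB, a single `-m` value can be divided into a low and a high portion with simple arithmetic: memory up to the limit goes below the hole and the remainder is placed above 4GB. The sketch below shows that split; the struct and function names are hypothetical and the limit is hard-coded here rather than adjustable.

```c
#include <stdint.h>

#define MB            (UINT64_C(1024) * 1024)
#define GB            (UINT64_C(1024) * MB)
#define LOWMEM_LIMIT  (3 * GB)  /* start of the PCI hole */

struct memsplit {
	uint64_t lowmem;   /* bytes mapped starting at 0 */
	uint64_t highmem;  /* bytes mapped starting at 4GB */
};

/* Split a total size given in MB around the PCI hole. */
static struct memsplit
split_memory(uint64_t memsize_mb)
{
	uint64_t total = memsize_mb * MB;
	struct memsplit m;

	m.lowmem = total < LOWMEM_LIMIT ? total : LOWMEM_LIMIT;
	m.highmem = total - m.lowmem;
	return (m);
}
```

A guest started with `-m 8192` would thus get 3GB below the hole and 5GB above 4GB, while `-m 2048` fits entirely in low memory.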
|
#
0ab13648 |
|
21-Feb-2013 |
Peter Grehan <grehan@FreeBSD.org> |
Add the ability to have a 'fallback' search for memory ranges. This set of ranges will be looked at if a standard memory range isn't found, and won't be installed in the cache. Use this to implement the memory behaviour of the PCI hole on x86 systems, where writes are ignored and reads always return -1. This allows breakpoints to be set when issuing a 'boot -d', which has the side effect of accessing the PCI hole when changing the PTE protection on kernel code, since the pmap layer hasn't been initialized (a bug, but present in existing FreeBSD releases so has to be handled). Reviewed by: neel Obtained from: NetApp
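The two-level lookup can be pictured as searching the regular ranges first and consulting the fallback ranges only on a miss. The sketch below uses plain arrays and a linear search to keep it short, and leaves out the caching that the real code performs; all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t    base;
	uint64_t    size;
	const char *name;
};

/* Linear search of one table; NULL if addr is not covered. */
static const struct range *
find_in(const struct range *tbl, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= tbl[i].base && addr - tbl[i].base < tbl[i].size)
			return (&tbl[i]);
	}
	return (NULL);
}

/* Regular ranges win; fallback ranges are consulted only on a miss. */
static const struct range *
lookup(const struct range *primary, size_t np,
    const struct range *fallback, size_t nf, uint64_t addr)
{
	const struct range *r = find_in(primary, np, addr);

	return (r != NULL ? r : find_in(fallback, nf, addr));
}

/* PCI hole semantics: a read of `size` bytes returns all ones. */
static uint64_t
hole_read(int size)
{
	return (size >= 8 ? UINT64_MAX : (UINT64_C(1) << (size * 8)) - 1);
}
```

A write handler for the hole would simply discard the value, so it needs no state at all.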
|
#
74f80b23 |
|
15-Feb-2013 |
Neel Natu <neel@FreeBSD.org> |
Advertise PCI-E capability in the hostbridge device presented to the guest. FreeBSD wants to see this capability in at least one device in the PCI hierarchy before it allows use of MSI or MSI-X. Obtained from: NetApp
|
#
aa12663f |
|
31-Jan-2013 |
Neel Natu <neel@FreeBSD.org> |
Fix a bug in the passthru implementation where it would assume that all devices are MSI-X capable. This in turn would lead it to treat bar 0 as the MSI-X table bar even if the underlying device did not support MSI-X. Fix this by providing an API to query the MSI-X table index of the emulated device. If the underlying device does not support MSI-X then this API will return -1. Obtained from: NetApp
|
#
c9b4e987 |
|
29-Jan-2013 |
Neel Natu <neel@FreeBSD.org> |
Add support for MSI-X interrupts in the virtio network device and make that the default. The current behavior of advertising a single MSI vector can be requested by setting the environment variable "BHYVE_USE_MSI" to "true". The use of MSI is not compliant with the virtio specification and will be eventually phased out. Submitted by: Gopakumar T Obtained from: NetApp
|
#
e285ef8d |
|
12-Dec-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Rename fbsdrun.* -> bhyverun.* bhyve is intended to be a generic hypervisor, and not FreeBSD-specific. (renaming internal routines will come later) Reviewed by: neel Obtained from: NetApp
|
#
90415e0b |
|
26-Oct-2012 |
Neel Natu <neel@FreeBSD.org> |
Ignore PCI configuration accesses to all bus numbers other than PCI bus 0. Obtained from: NetApp
|
#
fbfc1c76 |
|
26-Oct-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Remove mptable generation code from libvmmapi and move it to bhyve. Firmware tables require too much knowledge of system configuration, and it's difficult to pass that information in general terms to a library. The upcoming ACPI work exposed this - it will also live in bhyve. Also, remove code specific to NetApp from the mptable name, and remove the -n option from bhyve. Reviewed by: neel Obtained from: NetApp
|
#
4d1e669c |
|
19-Oct-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Rework how guest MMIO regions are dealt with. - New memory region interface. An RB tree holds the regions, with a last-found per-vCPU cache to deal with the common case of repeated guest accesses to MMIO registers in the same page. - Support memory-mapped BARs in PCI emulation. mem.c/h - memory region interface instruction_emul.c/h - remove old region interface. Use gpa from EPT exit to avoid a tablewalk to determine operand address. Determine operand size and use when calling through to region handler. fbsdrun.c - call into region interface on paging exit. Distinguish between instruction emul error and region not found pci_emul.c/h - implement new BAR callback api. Split BAR alloc routine into routines that require/don't require the BAR phys address. ioapic.c pci_passthru.c pci_virtio_block.c pci_virtio_net.c pci_uart.c - update to new BAR callback i/f Reviewed by: neel Obtained from: NetApp
|
#
25d4944e |
|
06-Aug-2012 |
Neel Natu <neel@FreeBSD.org> |
Fix a bug in how a 64-bit bar in a pci passthru device would be presented to the guest. Prior to the fix it was possible for such a bar to appear as a 32-bit bar as long as it was allocated from the region below 4GB. This had the potential to confuse some drivers that were particular about the size of the bars. Obtained from: NetApp
|
#
99d65389 |
|
06-Aug-2012 |
Neel Natu <neel@FreeBSD.org> |
Add support for emulating PCI multi-function devices. The function number is specified by an optional [:<func>] after the slot number: -s 1:0,virtio-net,tap0 Ditto for the mptable naming: -n 1:0,e0a Obtained from: NetApp
|
#
b0b53d3a |
|
04-Aug-2012 |
Neel Natu <neel@FreeBSD.org> |
Device model for ioapic emulation. With this change the uart emulation is entirely interrupt driven. Obtained from: NetApp
|
#
308f9077 |
|
03-Aug-2012 |
Neel Natu <neel@FreeBSD.org> |
Use the correct variable to index into the 'lirq[]' array to check the legacy IRQ ownership.
|
#
0038ee98 |
|
02-May-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Add 16550 uart emulation as a PCI device. This allows it to be activated as part of the slot config options. The syntax is: -s <slotnum>,uart[,stdio] The stdio parameter instructs the code to perform i/o using stdin/stdout. It can only be used for one instance. To allow legacy i/o ports/irqs to be used, a new variant of the slot command, -S, is introduced. When used to specify a slot, the device will use legacy resources if it supports them; otherwise it will be treated the same as the '-s' option. Specifying the -S option with the uart will first use the 0x3f8/irq 4 config, and the second -S will use 0x2F8/irq 3. Interrupt delivery is awaiting the arrival of the i/o apic code, but this works fine in uart(4)'s polled mode. This code was written by Cynthia Lu @ MIT while an intern at NetApp, with further work from neel@ and grehan@. Obtained from: NetApp
|
#
cd942e0f |
|
28-Apr-2012 |
Peter Grehan <grehan@FreeBSD.org> |
MSI-x interrupt support for PCI pass-thru devices. Includes instruction emulation for memory r/w access. This opens the door for io-apic, local apic, hpet timer, and legacy device emulation. Submitted by: ryan dot berryhill at sandvine dot com Reviewed by: grehan Obtained from: Sandvine
|
#
366f6083 |
|
12-May-2011 |
Peter Grehan <grehan@FreeBSD.org> |
Import of bhyve hypervisor and utilities, part 1. vmm.ko - kernel module for VT-x, VT-d and hypervisor control bhyve - user-space sequencer and i/o emulation vmmctl - dump of hypervisor register state libvmm - front-end to vmm.ko chardev interface bhyve was designed and implemented by Neel Natu. Thanks to the following folk from NetApp who helped to make this available: Joe CaraDonna Peter Snyder Jeff Heller Sandeep Mann Steve Miller Brian Pawlowski
|