#
56a26fc1 |
|
03-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Conditionalize compilation of some functions Hide definitions of several functions that currently don't have implementatations in the arm64 vmm port. In particular, add a WITH_VMMAPI_SNAPSHOT preprocessor variable that can be used to enable compilation of save/restore functions, and conditionalize compilation of some functions only used by amd64 bhyve. If in the long term they remain amd64-only, they can move to vmmapi_machdep.c, but for now it's not clear to me that that's the right thing to do. MFC after: 2 weeks Sponsored by: Innovate UK
|
#
e499fdcb |
|
03-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Zero out the structure passed to VM_GET_MEMSEG Avoid assuming that the kernel zeros the name buffer, it does not do this for zero-length segments. MFC after: 2 weeks Sponsored by: Innovate UK
|
#
5ec6c300 |
|
03-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Add arm64 support - Define wrappers for some MD ioctls. - Provide a list of vmm device ioctls for cap_ioctl_limit(). - Disable use of the lowmem region. Reviewed by: corvink MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41005
|
#
7e0fa794 |
|
03-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Make memory segment handling a bit more abstract libvmmapi leaves a hole at [3GB, 4GB) in the guest physical address space. This hole is not used in the arm64 port, which maps everything above 4GB. This change makes the code a bit more general to accomodate arm64 more naturally. In particular: - Remove vm_set_lowmem_limit(): it is unused and doesn't have well-defined constraints, e.g., nothing prevents a consumer from setting a lowmem limit above the highmem base. - Define a constant for the highmem base and use that everywhere that the base is currently hard-coded. - Make the lowmem limit a compile-time constant instead of a vmctx field. - Store segment info in an array. - Add vm_get_highmem_base(), for use in bhyve since the current value is hard-coded in some places. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41004
|
#
8b06bdc9 |
|
02-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Move PCI passthrough ioctl wrappers into a separate file The arm64 port doesn't implement PCI passthrough and in particular doesn't define the ioctls used by these wrappers. It might be that the ppt ioctl interface will require modification to support arm64. Until that's sorted out one way or another, put this code in a separate file so that it's easy to conditionally compile. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41003
|
#
3170dcae |
|
02-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Move more amd64-specific ioctl wrappers to vmmapi_machdep.c No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41002
|
#
7f00e46b |
|
02-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Split the ioctl list into MI and MD lists To enable use in capability mode, libvmmapi needs a list of all the ioctls that might be invoked on the vmm device handle. Some of these ioctls are amd64-specific. Move the ioctl list to vmmapi_machdep.c and define a list of MI ioctls so that the arm64 port can build its own list without duplicating common ioctls. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41001
|
#
85efb31d |
|
02-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Move VM capability names to vmmapi_machdep.c Add some missing entries while here. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41000
|
#
e4656e10 |
|
02-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Move some ioctl wrappers to vmmapi_machdep.c ioctls relating to segments and various x86-specific interrupt controllers are easy candidates to move to vmmapi_machdep.c. In vmmapi.h I'm just ifdefing MD prototypes for now. We could instead split vmmapi.h into multiple headers, e.g., vmmapi.h and vmmapi_machdep.h, but it's not obvious to me yet that that's the right approach. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40999
|
#
967264cf |
|
02-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Add a subdirectory for amd64-specific code Move vmmapi_freebsd.c there. It contains x86-specific code used only by bhyveload(8). Move vcpu_reset() into vmmapi_machdep.c. It is also x86-specific. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40998
|
#
b6c7ff58 |
|
08-Apr-2024 |
Rob Norris <robn@despairlabs.com> |
libvmmapi: add missing capability strings Signed-off-by: Rob Norris <robn@despairlabs.com> Reviewed by: markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D44642
|
#
a2f733ab |
|
24-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
lib: Automated cleanup of cdefs and other formatting Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/ Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/ Remove /\n+#if.*\n#endif.*\n+/ Remove /^#if.*\n#endif.*\n/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/ Sponsored by: Netflix
|
#
1d386b48 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
b3e76948 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
1320520b |
|
08-Jun-2023 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Remove some unneeded includes These are amd64-specific and so can't be used when targetting arm64, but they don't appear to be needed. No functional change intended. MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
#
e17eca32 |
|
23-May-2023 |
Mark Johnston <markj@FreeBSD.org> |
vmm: Avoid embedding cpuset_t ioctl ABIs Commit 0bda8d3e9f7a ("vmm: permit some IPIs to be handled by userspace") embedded cpuset_t into the vmm(4) ioctl ABI. This was a mistake since we otherwise have some leeway to change the cpuset_t for the whole system, but we want to keep the vmm ioctl ABI stable. Rework IPI reporting to avoid this problem. Along the way, make VM_RUN a bit more efficient: - Split vmexit metadata out of the main VM_RUN structure. This data is only written by the kernel. - Have userspace pass a cpuset_t pointer and cpusetsize in the VM_RUN structure, as is done for cpuset syscalls. - Have the destination CPU mask for VM_EXITCODE_IPIs live outside the vmexit info structure, and make VM_RUN copy it out separately. Zero out any extra bytes in the CPU mask, like cpuset syscalls do. - Modify the vmexit handler prototype to take a full VM_RUN structure. PR: 271330 Reviewed by: corvink, jhb (previous versions) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D40113
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
0f735657 |
|
24-Mar-2023 |
John Baldwin <jhb@FreeBSD.org> |
bhyve: Remove vmctx member from struct vm_snapshot_meta. This is a userland-only pointer that isn't relevant to the kernel and doesn't belong in the ioctl structure shared between userland and the kernel. For the kernel, the old structure for the ioctl is still supported under COMPAT_FREEBSD13. This changes vm_snapshot_req() in libvmmapi to accept an explicit vmctx argument. It also changes vm_snapshot_guest2host_addr to take an explicit vmctx argument. As part of this change, move the declaration for this function and its wrapper macro from vmm_snapshot.h to snapshot.h as it is a userland-only API. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38125
|
#
7d9ef309 |
|
24-Mar-2023 |
John Baldwin <jhb@FreeBSD.org> |
libvmmapi: Add a struct vcpu and use it in most APIs. This replaces the 'struct vm, int vcpuid' tuple passed to most API calls and is similar to the changes recently made in vmm(4) in the kernel. struct vcpu is an opaque type managed by libvmmapi. For now it stores a pointer to the VM context and an integer id. As an immediate effect this removes the divergence between the kernel and userland for the instruction emulation code introduced by the recent vmm(4) changes. Since this is a major change to the vmmapi API, bump VMMAPI_VERSION to 0x200 (2.0) and the shared library major version. While here (and since the major version is bumped), remove unused vcpu argument from vm_setup_pptdev_msi*(). Add new functions vm_suspend_all_cpus() and vm_resume_all_cpus() for use by the debug server. The underyling ioctl (which uses a vcpuid of -1) remains unchanged, but the userlevel API now uses separate functions for global CPU suspend/resume. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38124
|
#
755dcd5e |
|
05-Mar-2023 |
Vitaliy Gusev <gusev.vitaliy@gmail.com> |
libvmm: add missing ioctl's to vm_ioctl_cmds Reviewed by: corvink, markj MFC after: 1 week Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D38866
|
#
d3956e46 |
|
18-Nov-2022 |
John Baldwin <jhb@FreeBSD.org> |
vmm: Use struct vcpu in the instruction emulation code. This passes struct vcpu down in place of struct vm and and integer vcpu index through the in-kernel instruction emulation code. To minimize userland disruption, helper macros are used for the vCPU arguments passed into and through the shared instruction emulation code. A few other APIs used by the instruction emulation code have also been updated to accept struct vcpu in the kernel including vm_get/set_register and vm_inject_fault. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37161
|
#
2b4fe856 |
|
18-Nov-2022 |
John Baldwin <jhb@FreeBSD.org> |
bhyve: Remove unused vm and vcpu arguments from vm_copy routines. The arguments identifying the VM and vCPU are only needed for vm_copy_setup. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37158
|
#
3e9b4532 |
|
24-Oct-2022 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Provide an interface for limiting rights on the device fd Currently libvmmapi provides a way to get a list of the allowed ioctls on the vmm device file, so that bhyve can limit rights on the device file fd. The interface is rather strange: it allocates a copy of the list but returns a const pointer, so the caller has to cast away the const in order to free it without aggravating the compiler. As far as I can see, there's no reason to make a copy of the array, but changing vm_get_ioctls() to not do that would break compatibility. So this change just introduces a better interface: move all rights-limiting logic into libvmmapi. Any new operations on the fd should be wrapped by libvmmapi, so also discourage use of vm_get_device_fd(). Currently bhyve uses it only when limiting rights on the device fd. No functional change intended. Reviewed by: jhb MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D37098
|
#
56b8f5b1 |
|
27-Jul-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: Initialize more registers in vcpu_reset() - Clear CR2, EFER, and R8-15 to zero. - Reset DR6 and DR7 to their documented reset values. - Reset interrupt shadow state. - Document the reason CR0 is reset to a value that doesn't match its documented value. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D35622 Sponsored by: Beckhoff Automation GmbH & Co. KG
|
#
f0880ab7 |
|
30-Jun-2022 |
Vitaliy Gusev <gusev.vitaliy@gmail.com> |
libvmmapi: Add vm_close() Currently there is no way to safely free a vm structure without leaking the fd. vm_destroy() closes the fd but also destroys the VM whereas in some cases a VM needs to be opened (vm_open) and then closed (vm_close). Reviewed by: jhb Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D35073
|
#
3efc45f3 |
|
17-Mar-2022 |
Robert Wing <rew@FreeBSD.org> |
libvmm: constify vm_get_name() Allows callers of vm_get_name() to retrieve the vm name without having to allocate a buffer. While in the vicinity, do minor cleanup in vm_snapshot_basic_metadata(). Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D34290
|
#
64269786 |
|
07-Feb-2022 |
John Baldwin <jhb@FreeBSD.org> |
Extend the VMM stats interface to support a dynamic count of statistics. - Add a starting index to 'struct vmstats' and change the VM_STATS ioctl to fetch the 64 stats starting at that index. A compat shim for <= 13 continues to fetch only the first 64 stats. - Extend vm_get_stats() in libvmmapi to use a loop and a static thread local buffer which grows to hold the stats needed. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D27463
|
#
5ac4ac85 |
|
15-Sep-2021 |
John Baldwin <jhb@FreeBSD.org> |
Remove an always-true check. This fixes a -Wtype-limits error from GCC 9. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D31936
|
#
45cd18ec |
|
26-Jul-2021 |
Mark Johnston <markj@FreeBSD.org> |
libvmmapi: Fix warnings and stop overridding WARNS - Avoid shadowing the global optarg. - Sprinkle __unused. - Cast nitems() to int. - Fix sign in vm_copy_setup(). Reviewed by: grehan MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31306
|
#
a7f81b48 |
|
11-Mar-2021 |
Robert Wing <rew@FreeBSD.org> |
libvmm: explicitly save and restore errno in vm_open() In commit 6bb140e3ca895a14, vm_destroy() was replaced with free() to preserve errno. However, it's possible that free() may change the errno as well. Keep the free() call, but explicitly save and restore errno. Noted by: jhb Fixes: 6bb140e3ca895a14
|
#
f8a6ec2d |
|
18-Mar-2021 |
D Scott Phillips <scottph@FreeBSD.org> |
bhyve: support relocating fbuf and passthru data BARs We want to allow the UEFI firmware to enumerate and assign addresses to PCI devices so we can boot from NVMe[1]. Address assignment of PCI BARs is properly handled by the PCI emulation code in general, but a few specific cases need additional support. fbuf and passthru map additional objects into the guest physical address space and so need to handle address updates. Here we add a callback to emulated PCI devices to inform them of a BAR configuration change. fbuf and passthru then watch for these BAR changes and relocate the frame buffer memory segment and passthru device mmio area respectively. We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls to vmm(4) to facilitate the unmapping needed for addres updates. [1]: https://github.com/freebsd/uefi-edk2/pull/9/ Originally by: scottph MFC After: 1 week Sponsored by: Intel Corporation Reviewed by: grehan Approved by: philip (mentor) Differential Revision: https://reviews.freebsd.org/D24066
|
#
6bb140e3 |
|
06-Mar-2021 |
Robert Wing <rew@FreeBSD.org> |
bhyvectl: print a better error message when vm_open() fails Use errno to print a more descriptive error message when vm_open() fails libvmm: preserve errno when vm_device_open() fails vm_destroy() squashes errno by making a dive into sysctlbyname() - we can safely skip vm_destroy() here since it's not doing any critical clean up at this point. Replace vm_destroy() with a free() call. PR: 250671 MFC after: 3 days Submitted by: marko@apache.org Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D29109
|
#
1925586e |
|
24-Nov-2020 |
John Baldwin <jhb@FreeBSD.org> |
Honor the disabled setting for MSI-X interrupts for passthrough devices. Add a new ioctl to disable all MSI-X interrupts for a PCI passthrough device and invoke it if a write to the MSI-X capability registers disables MSI-X. This avoids leaving MSI-X interrupts enabled on the host if a guest device driver has disabled them (e.g. as part of detaching a guest device driver). This was found by Chelsio QA when testing that a Linux guest could switch from MSI-X to MSI interrupts when using the cxgb4vf driver. While here, explicitly fail requests to enable MSI on a passthrough device if MSI-X is enabled and vice versa. Reported by: Sony Arpita Das @ Chelsio Reviewed by: grehan, markj MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27212
|
#
8a68ae80 |
|
15-May-2020 |
Conrad Meyer <cem@FreeBSD.org> |
vmm(4), bhyve(8): Expose kernel-emulated special devices to userspace Expose the special kernel LAPIC, IOAPIC, and HPET devices to userspace for use in, e.g., fallback instruction emulation (when userspace has a newer instruction decode/emulation layer than the kernel vmm(4)). Plumb the ioctl through libvmmapi and register the memory ranges in bhyve(8). Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D24525
|
#
483d953a |
|
04-May-2020 |
John Baldwin <jhb@FreeBSD.org> |
Initial support for bhyve save and restore. Save and restore (also known as suspend and resume) permits a snapshot to be taken of a guest's state that can later be resumed. In the current implementation, bhyve(8) creates a UNIX domain socket that is used by bhyvectl(8) to send a request to save a snapshot (and optionally exit after the snapshot has been taken). A snapshot currently consists of two files: the first holds a copy of guest RAM, and the second file holds other guest state such as vCPU register values and device model state. To resume a guest, bhyve(8) must be started with a matching pair of command line arguments to instantiate the same set of device models as well as a pointer to the saved snapshot. While the current implementation is useful for several uses cases, it has a few limitations. The file format for saving the guest state is tied to the ABI of internal bhyve structures and is not self-describing (in that it does not communicate the set of device models present in the system). In addition, the state saved for some device models closely matches the internal data structures which might prove a challenge for compatibility of snapshot files across a range of bhyve versions. The file format also does not currently support versioning of individual chunks of state. As a result, the current file format is not a fixed binary format and future revisions to save and restore will break binary compatiblity of snapshot files. The goal is to move to a more flexible format that adds versioning, etc. and at that point to commit to providing a reasonable level of compatibility. As a result, the current implementation is not enabled by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option for userland builds, and the kernel option BHYVE_SHAPSHOT. Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz Relnotes: yes Sponsored by: University Politehnica of Bucharest Sponsored by: Matthew Grooms (student scholarships) Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D19495
|
#
24c2e17d |
|
21-Apr-2020 |
John Baldwin <jhb@FreeBSD.org> |
Map negative types passed to vm_capability_type2name to NULL. Submitted by: vangyzen
|
#
d000623a |
|
21-Apr-2020 |
John Baldwin <jhb@FreeBSD.org> |
Add description string for VM_CAP_BPT_EXIT. While here, replace the array of mapping structures with an array of string pointers where the index is the capability value. Submitted by: Rob Fairbanks <rob.fx907@gmail.com> Reviewed by: rgrimes MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24289
|
#
af1e30f8 |
|
16-Dec-2019 |
Marcelo Araujo <araujo@FreeBSD.org> |
Forgotten to remove the previous if statement in commit r355838. MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D19400
|
#
a71dc724 |
|
16-Dec-2019 |
Marcelo Araujo <araujo@FreeBSD.org> |
Attempt to load vmm(4) module before creating a guest using vm_create() wrapper in libvmmapi. Submitted by: Rob Fairbanks <rob.fx907_gmail.com> Reviewed by: jhb MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D19400
|
#
6a9648b5 |
|
06-Sep-2018 |
John Baldwin <jhb@FreeBSD.org> |
bhyve: Use MAP_GUARD when mapping guest memory ranges. Instead of relying on PROT_NONE mappings with MAP_ANON, use MAP_GUARD to reserve address space around guest memory ranges including the guard ranges of address space around mappings. Submitted by: Shawn Webb Reviewed by: araujo Approved by: re (rgrimes) MFC after: 1 month Sponsored by: HardendBSD and G2, Inc Differential Revision: https://reviews.freebsd.org/D16822
|
#
23fe789d |
|
13-Jun-2018 |
Marcelo Araujo <araujo@FreeBSD.org> |
Fix style(9) space vs tab. Reviewed by: jhb MFC after: 3 weeks. Sponsored by: iXsystems Inc. Differential Revision: https://reviews.freebsd.org/D15774
|
#
01d822d3 |
|
08-Apr-2018 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Add the ability to control the CPU topology of created VMs from userland without the need to use sysctls, it allows the old sysctls to continue to function, but deprecates them at FreeBSD_version 1200060 (Relnotes for deprecate). The command line of bhyve is maintained in a backwards compatible way. The API of libvmmapi is maintained in a backwards compatible way. The sysctl's are maintained in a backwards compatible way. Added command option looks like: bhyve -c [[cpus=]n][,sockets=n][,cores=n][,threads=n][,maxcpus=n] The optional parts can be specified in any order, but only a single integer invokes the backwards compatible parse. [,maxcpus=n] is hidden by #ifdef until kernel support is added, though the api is put in place. bhyvectl --get-cpu-topology option added. Reviewed by: grehan (maintainer, earlier version), Reviewed by: bcr (manpages) Approved by: bde (mentor), phk (mentor) Tested by: Oleg Ginzburg <olevole@olevole.ru> (cbsd) MFC after: 1 week Relnotes: Y Differential Revision: https://reviews.freebsd.org/D9930
|
#
fc276d92 |
|
06-Apr-2018 |
John Baldwin <jhb@FreeBSD.org> |
Add a way to temporarily suspend and resume virtual CPUs. This is used as part of implementing run control in bhyve's debug server. The hypervisor now maintains a set of "debugged" CPUs. Attempting to run a debugged CPU will fail to execute any guest instructions and will instead report a VM_EXITCODE_DEBUG exit to the userland hypervisor. Virtual CPUs are placed into the debugged state via vm_suspend_cpu() (implemented via a new VM_SUSPEND_CPU ioctl). Virtual CPUs can be resumed via vm_resume_cpu() (VM_RESUME_CPU ioctl). The debug server suspends virtual CPUs when it wishes them to stop executing in the guest (for example, when a debugger attaches to the server). The debug server can choose to resume only a subset of CPUs (for example, when single stepping) or it can choose to resume all CPUs. The debug server must explicitly mark a CPU as resumed via vm_resume_cpu() before the virtual CPU will successfully execute any guest instructions. Reviewed by: avg, grehan Tested on: Intel (jhb), AMD (avg) Differential Revision: https://reviews.freebsd.org/D14466
|
#
5f8754c0 |
|
26-Feb-2018 |
John Baldwin <jhb@FreeBSD.org> |
Add a new variant of the GLA2GPA ioctl for use by the debug server. Unlike the existing GLA2GPA ioctl, GLA2GPA_NOFAULT does not modify the guest. In particular, it does not inject any faults or modify PTEs in the guest when performing an address space translation. This is used by bhyve's debug server to read and write memory for the remote debugger. Reviewed by: grehan MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D14075
|
#
4f866698 |
|
21-Feb-2018 |
John Baldwin <jhb@FreeBSD.org> |
Add two new ioctls to bhyve for batch register fetch/store operations. These are a convenience for bhyve's debug server to use a single ioctl for 'g' and 'G' rather than a loop of individual get/set ioctl requests. Reviewed by: grehan MFC after: 2 months Differential Revision: https://reviews.freebsd.org/D14074
|
#
5e53a4f9 |
|
25-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
lib: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using mis-identified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.
|
#
00ef17be |
|
14-Feb-2017 |
Bartek Rutkowski <robak@FreeBSD.org> |
Capsicum support for bhyve(8). Adds Capsicum sandboxing to bhyve. Submitted by: Pawel Biernacki <pawel.biernacki@gmail.com> Reviewed by: grehan, oshogbo Approved by: emaste, grehan Sponsored by: Mysterious Code Ltd. Differential Revision: https://reviews.freebsd.org/D8290
|
#
edc816d6 |
|
06-Dec-2016 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix possible integer overflow in guest memory bounds checking, which could lead to access from the virtual machine to the heap of the bhyve(8) process. Submitted by: Felix Wilhelm <fwilhelm ernw.de> Patch by: grehan Security: FreeBSD-SA-16:38.bhyve
|
#
d6849317 |
|
22-Feb-2016 |
Svatopluk Kraus <skra@FreeBSD.org> |
As <machine/param.h> is included from <sys/param.h>, there is no need to include it explicitly when <sys/param.h> is already included. Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D5378
|
#
5e4f29c0 |
|
06-Jul-2015 |
Neel Natu <neel@FreeBSD.org> |
Move the 'devmem' device nodes from /dev/vmm to /dev/vmm.io Some external tools just do a 'ls /dev/vmm' to figure out the bhyve virtual machines on the host. These tools break if the devmem device nodes also appear in /dev/vmm. Requested by: grehan
|
#
36e8356e |
|
21-Jun-2015 |
Neel Natu <neel@FreeBSD.org> |
Fix a regression in "movs" emulation after r284539. The regression was caused due to a change in behavior of the 'vm_map_gpa()'. Prior to r284539 if 'vm_map_gpa()' was called to map an address range in the guest MMIO region then it would return NULL. This was used by the "movs" emulation to detect if the 'src' or 'dst' operand was in MMIO space. Post r284539 'vm_map_gpa()' started returning a non-NULL pointer even when mapping the guest MMIO region. Fix this by returning non-NULL only if [gaddr, gaddr+len) is entirely within the 'lowmem' or 'highmem' regions and NULL otherwise. Pointy hat to: neel Reviewed by: grehan Reported by: tychon, Ben Perrault (ben.perrault@gmail.com) MFC after: 1 week
|
#
9b1aa8d6 |
|
18-Jun-2015 |
Neel Natu <neel@FreeBSD.org> |
Restructure memory allocation in bhyve to support "devmem". devmem is used to represent MMIO devices like the boot ROM or a VESA framebuffer where doing a trap-and-emulate for every access is impractical. devmem is a hybrid of system memory (sysmem) and emulated device models. devmem is mapped in the guest address space via nested page tables similar to sysmem. However the address range where devmem is mapped may be changed by the guest at runtime (e.g. by reprogramming a PCI BAR). Also devmem is usually mapped RO or RW as compared to RWX mappings for sysmem. Each devmem segment is named (e.g. "bootrom") and this name is used to create a device node for the devmem segment (e.g. /dev/vmm/testvm.bootrom). The device node supports mmap(2) and this decouples the host mapping of devmem from its mapping in the guest address space (which can change). Reviewed by: tychon Discussed with: grehan Differential Revision: https://reviews.freebsd.org/D2762 MFC after: 4 weeks
|
#
9c4d5478 |
|
06-May-2015 |
Neel Natu <neel@FreeBSD.org> |
Deprecate the 3-way return values from vm_gla2gpa() and vm_copy_setup(). Prior to this change both functions returned 0 for success, -1 for failure and +1 to indicate that an exception was injected into the guest. The numerical value of ERESTART also happens to be -1 so when these functions returned -1 it had to be translated to a positive errno value to prevent the VM_RUN ioctl from being inadvertently restarted. This made it easy to introduce bugs when writing emulation code. Fix this by adding an 'int *guest_fault' parameter and setting it to '1' if an exception was delivered to the guest. The return value is 0 or EFAULT so no additional translation is needed. Reviewed by: tychon MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D2428
|
#
ef7c2a82 |
|
31-Mar-2015 |
Tycho Nightingale <tychon@FreeBSD.org> |
Fix "MOVS" instruction memory to MMIO emulation. Currently updates to %rdi, %rsi, etc are inadvertently bypassed along with the check to see if the instruction needs to be repeated per the 'rep' prefix. Add "MOVS" instruction support for the 'MMIO to MMIO' case. Reviewed by: neel
|
#
009e2acb |
|
18-Jan-2015 |
Neel Natu <neel@FreeBSD.org> |
Fix a bug in libvmmapi 'vm_copy_setup()' where it would return success even if the 'gpa' was in the guest MMIO region. This would manifest as a segmentation fault in 'vm_map_copyin()' or 'vm_map_copyout()' because 'vm_map_gpa()' would return NULL for this 'gpa'. Fix this by calling 'vm_map_gpa()' in 'vm_copy_setup' and returning a failure if the 'gpa' cannot be mapped. This matches the behavior of 'vm_copy_setup()' in vmm.ko. MFC after: 1 week
|
#
d087a399 |
|
17-Jan-2015 |
Neel Natu <neel@FreeBSD.org> |
Simplify instruction restart logic in bhyve. Keep track of the next instruction to be executed by the vcpu as 'nextrip'. As a result the VM_RUN ioctl no longer takes the %rip where a vcpu should start execution. Also, instruction restart happens implicitly via 'vm_inject_exception()' or explicitly via 'vm_restart_instruction()'. The APIs behave identically in both kernel and userspace contexts. The main beneficiary is the instruction emulation code that executes in both contexts. bhyve(8) VM exit handlers now treat 'vmexit->rip' and 'vmexit->inst_length' as readonly: - Restarting an instruction is now done by calling 'vm_restart_instruction()' as opposed to setting 'vmexit->inst_length' to 0 (e.g. emulate_inout()) - Resuming vcpu at an arbitrary %rip is now done by setting VM_REG_GUEST_RIP as opposed to changing 'vmexit->rip' (e.g. vmexit_task_switch()) Differential Revision: https://reviews.freebsd.org/D1526 Reviewed by: grehan MFC after: 2 weeks
|
#
0dafa5cd |
|
30-Dec-2014 |
Neel Natu <neel@FreeBSD.org> |
Replace bhyve's minimal RTC emulation with a fully featured one in vmm.ko. The new RTC emulation supports all interrupt modes: periodic, update ended and alarm. It is also capable of maintaining the date/time and NVRAM contents across virtual machine reset. Also, the date/time fields can now be modified by the guest. Since bhyve now emulates both the PIT and the RTC there is no need for "Legacy Replacement Routing" in the HPET so get rid of it. The RTC device state can be inspected via bhyvectl as follows: bhyvectl --vm=vm --get-rtc-time bhyvectl --vm=vm --set-rtc-time=<unix_time_secs> bhyvectl --vm=vm --rtc-nvram-offset=<offset> --get-rtc-nvram bhyvectl --vm=vm --rtc-nvram-offset=<offset> --set-rtc-nvram=<value> Reviewed by: tychon Discussed with: grehan Differential Revision: https://reviews.freebsd.org/D1385 MFC after: 2 weeks
|
#
d37f2adb |
|
23-Jul-2014 |
Neel Natu <neel@FreeBSD.org> |
Fix fault injection in bhyve. The faulting instruction needs to be restarted when the exception handler is done handling the fault. bhyve now does this correctly by setting 'vmexit[vcpu].inst_length' to zero so the %rip is not advanced. A minor complication is that the fault injection APIs are used by instruction emulation code that is shared by vmm.ko and bhyve. Thus the argument that refers to 'struct vm *' in kernel or 'struct vmctx *' in userspace needs to be loosely typed as a 'void *'.
|
#
d665d229 |
|
22-Jul-2014 |
Neel Natu <neel@FreeBSD.org> |
Emulate instructions emitted by OpenBSD/i386 version 5.5: - CMP REG, r/m - MOV AX/EAX/RAX, moffset - MOV moffset, AX/EAX/RAX - PUSH r/m
|
#
091d4532 |
|
19-Jul-2014 |
Neel Natu <neel@FreeBSD.org> |
Handle nested exceptions in bhyve. A nested exception condition arises when a second exception is triggered while delivering the first exception. Most nested exceptions can be handled serially but some are converted into a double fault. If an exception is generated during delivery of a double fault then the virtual machine shuts down as a result of a triple fault. vm_exit_intinfo() is used to record that a VM-exit happened while an event was being delivered through the IDT. If an exception is triggered while handling the VM-exit it will be treated like a nested exception. vm_entry_intinfo() is used by processor-specific code to get the event to be injected into the guest on the next VM-entry. This function is responsible for deciding the disposition of nested exceptions.
|
#
be679db4 |
|
23-Jun-2014 |
Neel Natu <neel@FreeBSD.org> |
Provide APIs to directly get 'lowmem' and 'highmem' size directly. Previously the sizes were inferred indirectly based on the size of the mappings at 0 and 4GB respectively. This works fine as long as size of the allocation is identical to the size of the mapping in the guest's address space. However, if the mapping is disjoint then this assumption falls apart (e.g., due to the legacy BIOS hole between 640KB and 1MB).
|
#
5fcf252f |
|
07-Jun-2014 |
Neel Natu <neel@FreeBSD.org> |
Add ioctl(VM_REINIT) to reinitialize the virtual machine state maintained by vmm.ko. This allows the virtual machine to be restarted without having to destroy it first. Reviewed by: grehan
|
#
95ebc360 |
|
31-May-2014 |
Neel Natu <neel@FreeBSD.org> |
Activate vcpus from bhyve(8) using the ioctl VM_ACTIVATE_CPU instead of doing it implicitly in vmm.ko. Add ioctl VM_GET_CPUS to get the current set of 'active' and 'suspended' cpus and display them via /usr/sbin/bhyvectl using the "--get-active-cpus" and "--get-suspended-cpus" options. This is in preparation for being able to reset virtual machine state without having to destroy and recreate it.
|
#
6303b65d |
|
26-May-2014 |
Neel Natu <neel@FreeBSD.org> |
Fix issue with restarting an "insb/insw/insl" instruction because of a page fault on the destination buffer. Prior to this change a page fault would be detected in vm_copyout(). This was done after the I/O port access was done. If the I/O port access had side-effects (e.g. reading the uart FIFO) then restarting the instruction would result in incorrect behavior. Fix this by validating the guest linear address before doing the I/O port emulation. If the validation results in a page fault exception being injected into the guest then the instruction can now be restarted without any side-effects.
|
#
da11f4aa |
|
24-May-2014 |
Neel Natu <neel@FreeBSD.org> |
Add libvmmapi functions vm_copyin() and vm_copyout() to copy into and out of the guest linear address space. These APIs in turn use a new ioctl 'VM_GLA2GPA' to convert the guest linear address to guest physical. Use the new copyin/copyout APIs when emulating ins/outs instruction in bhyve(8).
|
#
b3e9732a |
|
15-May-2014 |
John Baldwin <jhb@FreeBSD.org> |
Implement a PCI interrupt router to route PCI legacy INTx interrupts to the legacy 8259A PICs. - Implement an ICH-comptabile PCI interrupt router on the lpc device with 8 steerable pins configured via config space access to byte-wide registers at 0x60-63 and 0x68-6b. - For each configured PCI INTx interrupt, route it to both an I/O APIC pin and a PCI interrupt router pin. When a PCI INTx interrupt is asserted, ensure that both pins are asserted. - Provide an initial routing of PCI interrupt router (PIRQ) pins to 8259A pins (ISA IRQs) and initialize the interrupt line config register for the corresponding PCI function with the ISA IRQ as this matches existing hardware. - Add a global _PIC method for OSPM to select the desired interrupt routing configuration. - Update the _PRT methods for PCI bridges to provide both APIC and legacy PRT tables and return the appropriate table based on the configured routing configuration. Note that if the lpc device is not configured, no routing information is provided. - When the lpc device is enabled, provide ACPI PCI link devices corresponding to each PIRQ pin. - Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A pins via the ELCR. - Mark the power management SCI as level triggered. - Don't hardcode the number of elements in Packages in the source for the DSDT. iasl(8) will fill in the actual number of elements, and this makes it simpler to generate a Package with a variable number of elements. Reviewed by: tycho
|
#
0dd10c00 |
|
13-May-2014 |
Neel Natu <neel@FreeBSD.org> |
Don't include the guest memory segments in the bhyve(8) process core dump. This has not added a lot of value when debugging bhyve issues while greatly increasing the time and space required to store the core file. Passing the "-C" option to bhyve(8) will change the default and dump guest memory in the core dump. Requested by: grehan Reviewed by: grehan
|
#
f0fdcfe2 |
|
28-Apr-2014 |
Neel Natu <neel@FreeBSD.org> |
Allow a virtual machine to be forcibly reset or powered off. This is done by adding an argument to the VM_SUSPEND ioctl that specifies how the virtual machine should be suspended, viz. VM_SUSPEND_RESET or VM_SUSPEND_POWEROFF. The disposition of VM_SUSPEND is also made available to the exit handler via the 'u.suspended' member of 'struct vm_exit'. This capability is exposed via the '--force-reset' and '--force-poweroff' arguments to /usr/sbin/bhyvectl. Discussed with: grehan@
|
#
b15a09c0 |
|
26-Mar-2014 |
Neel Natu <neel@FreeBSD.org> |
Add an ioctl to suspend a virtual machine (VM_SUSPEND). The ioctl can be called from any context i.e., it is not required to be called from a vcpu thread. The ioctl simply sets a state variable 'vm->suspend' to '1' and returns. The vcpus inspect 'vm->suspend' in the run loop and if it is set to '1' the vcpu breaks out of the loop with a reason of 'VM_EXITCODE_SUSPENDED'. The suspend handler waits until all 'vm->active_cpus' have transitioned to 'vm->suspended_cpus' before returning to userspace. Discussed with: grehan
|
#
762fd208 |
|
11-Mar-2014 |
Tycho Nightingale <tychon@FreeBSD.org> |
Replace the userspace atpic stub with a more functional vmm.ko model. New ioctls VM_ISA_ASSERT_IRQ, VM_ISA_DEASSERT_IRQ and VM_ISA_PULSE_IRQ can be used to manipulate the pic, and optionally the ioapic, pin state. Reviewed by: jhb, neel Approved by: neel (co-mentor)
|
#
dc506506 |
|
25-Feb-2014 |
Neel Natu <neel@FreeBSD.org> |
Queue pending exceptions in the 'struct vcpu' instead of directly updating the processor-specific VMCS or VMCB. The pending exception will be delivered right before entering the guest. The order of event injection into the guest is: - hardware exception - NMI - maskable interrupt In the Intel VT-x case, a pending NMI or interrupt will enable the interrupt window-exiting and inject it as soon as possible after the hardware exception is injected. Also since interrupts are inherently asynchronous, injecting them after the hardware exception should not affect correctness from the guest perspective. Rename the unused ioctl VM_INJECT_EVENT to VM_INJECT_EXCEPTION and restrict it to only deliver x86 hardware exceptions. This new ioctl is now used to inject a protection fault when the guest accesses an unimplemented MSR. Discussed with: grehan, jhb Reviewed by: jhb
|
#
3cbf3585 |
|
29-Jan-2014 |
John Baldwin <jhb@FreeBSD.org> |
Enhance the support for PCI legacy INTx interrupts and enable them in the virtio backends. - Add a new ioctl to export the count of pins on the I/O APIC from vmm to the hypervisor. - Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for ISA interrupts. - Populate the MP Table with I/O interrupt entries for any PCI INTx interrupts. - Create a _PRT table under the PCI root bridge in ACPI to route any PCI INTx interrupts appropriately. - Track which INTx interrupts are in use per-slot so that functions that share a slot attempt to distribute their INTx interrupts across the four available pins. - Implicitly mask INTx interrupts if either MSI or MSI-X is enabled and when the INTx DIS bit is set in a function's PCI command register. Either assert or deassert the associated I/O APIC pin when the state of one of those conditions changes. - Add INTx support to the virtio backends. - Always advertise the MSI capability in the virtio backends. Submitted by: neel (7) Reviewed by: neel MFC after: 2 weeks
|
#
330baf58 |
|
23-Dec-2013 |
John Baldwin <jhb@FreeBSD.org> |
Extend the support for local interrupts on the local APIC: - Add a generic routine to trigger an LVT interrupt that supports both fixed and NMI delivery modes. - Add an ioctl and bhyvectl command to trigger local interrupts inside a guest. In particular, a global NMI similar to that raised by SERR# or PERR# can be simulated by asserting LINT1 on all vCPUs. - Extend the LVT table in the vCPU local APIC to support CMCI. - Flesh out the local APIC error reporting a bit to cache errors and report them via ESR when ESR is written to. Add support for asserting the error LVT when an error occurs. Raise illegal vector errors when attempting to signal an invalid vector for an interrupt or when sending an IPI. - Ignore writes to reserved bits in LVT entries. - Export table entries the MADT and MP Table advertising the stock x86 config of LINT0 set to ExtInt and LINT1 wired to NMI. Reviewed by: neel (earlier version)
|
#
55888cfa |
|
17-Dec-2013 |
Neel Natu <neel@FreeBSD.org> |
Rename the ambiguously named 'vm_setup_msi()' and 'vm_setup_msix()' to 'vm_setup_pptdev_msi()' and 'vm_setup_pptdev_msix()' respectively. It should now be clear that these functions operate on passthru devices.
|
#
4f8be175 |
|
16-Dec-2013 |
Neel Natu <neel@FreeBSD.org> |
Add an API to deliver message signalled interrupts to vcpus. This allows callers treat the MSI 'addr' and 'data' fields as opaque and also lets bhyve implement multiple destination modes: physical, flat and clustered. Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com) Reviewed by: grehan@
|
#
08e3ff32 |
|
25-Nov-2013 |
Neel Natu <neel@FreeBSD.org> |
Add HPET device emulation to bhyve. bhyve supports a single timer block with 8 timers. The timers are all 32-bit and capable of being operated in periodic mode. All timers support interrupt delivery using MSI. Timers 0 and 1 also support legacy interrupt routing. At the moment the timers are not connected to any ioapic pins but that will be addressed in a subsequent commit. This change is based on a patch from Tycho Nightingale (tycho.nightingale@pluribusnetworks.com).
|
#
ac7304a7 |
|
22-Nov-2013 |
Neel Natu <neel@FreeBSD.org> |
Add an ioctl to assert and deassert an ioapic pin atomically. This will be used to inject edge triggered legacy interrupts into the guest. Start using the new API in device models that use edge triggered interrupts: viz. the 8254 timer and the LPC/uart device emulation. Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
|
#
565bbb86 |
|
12-Nov-2013 |
Neel Natu <neel@FreeBSD.org> |
Move the ioapic device model from userspace into vmm.ko. This is needed for upcoming in-kernel device emulations like the HPET. The ioctls VM_IOAPIC_ASSERT_IRQ and VM_IOAPIC_DEASSERT_IRQ are used to manipulate the ioapic pin state. Discussed with: grehan@ Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
|
#
49cc03da |
|
16-Oct-2013 |
Neel Natu <neel@FreeBSD.org> |
Add a new capability, VM_CAP_ENABLE_INVPCID, that can be enabled to expose 'invpcid' instruction to the guest. Currently bhyve will try to enable this capability unconditionally if it is available. Consolidate code in bhyve to set the capabilities so it is no longer duplicated in BSP and AP bringup. Add a sysctl 'vm.pmap.invpcid_works' to display whether the 'invpcid' instruction is available. Reviewed by: grehan MFC after: 3 days
|
#
200758f1 |
|
08-Oct-2013 |
Neel Natu <neel@FreeBSD.org> |
Parse the memory size parameter using expand_number() to allow specifying the memory size more intuitively (e.g. 512M, 4G etc). Submitted by: rodrigc Reviewed by: grehan Approved by: re (blanket)
|
#
318224bb |
|
05-Oct-2013 |
Neel Natu <neel@FreeBSD.org> |
Merge projects/bhyve_npt_pmap into head. Make the amd64/pmap code aware of nested page table mappings used by bhyve guests. This allows bhyve to associate each guest with its own vmspace and deal with nested page faults in the context of that vmspace. This also enables features like accessed/dirty bit tracking, swapping to disk and transparent superpage promotions of guest memory. Guest vmspace: Each bhyve guest has a unique vmspace to represent the physical memory allocated to the guest. Each memory segment allocated by the guest is mapped into the guest's address space via the 'vmspace->vm_map' and is backed by an object of type OBJT_DEFAULT. pmap types: The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT. The PT_X86 pmap type is used by the vmspace associated with the host kernel as well as user processes executing on the host. The PT_EPT pmap is used by the vmspace associated with a bhyve guest. Page Table Entries: The EPT page table entries as mostly similar in functionality to regular page table entries although there are some differences in terms of what bits are used to express that functionality. For e.g. the dirty bit is represented by bit 9 in the nested PTE as opposed to bit 6 in the regular x86 PTE. Therefore the bitmask representing the dirty bit is now computed at runtime based on the type of the pmap. Thus PG_M that was previously a macro now becomes a local variable that is initialized at runtime using 'pmap_modified_bit(pmap)'. An additional wrinkle associated with EPT mappings is that older Intel processors don't have hardware support for tracking accessed/dirty bits in the PTE. This means that the amd64/pmap code needs to emulate these bits to provide proper accounting to the VM subsystem. This is achieved by using the following mapping for EPT entries that need emulation of A/D bits: Bit Position Interpreted By PG_V 52 software (accessed bit emulation handler) PG_RW 53 software (dirty bit emulation handler) PG_A 0 hardware (aka EPT_PG_RD) PG_M 1 hardware (aka EPT_PG_WR) The idea to use the mapping listed above for A/D bit emulation came from Alan Cox (alc@). The final difference with respect to x86 PTEs is that some EPT implementations do not support superpage mappings. This is recorded in the 'pm_flags' field of the pmap. TLB invalidation: The amd64/pmap code has a number of ways to do invalidation of mappings that may be cached in the TLB: single page, multiple pages in a range or the entire TLB. All of these funnel into a single EPT invalidation routine called 'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and sends an IPI to the host cpus that are executing the guest's vcpus. On a subsequent entry into the guest it will detect that the EPT has changed and invalidate the mappings from the TLB. Guest memory access: Since the guest memory is no longer wired we need to hold the host physical page that backs the guest physical page before we can access it. The helper functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose. PCI passthru: Guest's with PCI passthru devices will wire the entire guest physical address space. The MMIO BAR associated with the passthru device is backed by a vm_object of type OBJT_SG. An IOMMU domain is created only for guest's that have one or more PCI passthru devices attached to them. Limitations: There isn't a way to map a guest physical page without execute permissions. This is because the amd64/pmap code interprets the guest physical mappings as user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U shares the same bit position as EPT_PG_EXECUTE all guest mappings become automatically executable. Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews as well as their support and encouragement. Thanks for John Baldwin for reviewing the use of OBJT_SG as the backing object for pci passthru mmio regions. Special thanks to Peter Holm for testing the patch on short notice. Approved by: re Discussed with: grehan Reviewed by: alc, kib Tested by: pho
|
#
ade4af66 |
|
25-Apr-2013 |
Neel Natu <neel@FreeBSD.org> |
Remove deprecated APIs to get the total and free memory available to vmm.ko. These APIs were relevant when memory for virtual machine allocation was hard partitioned away from the rest of the system but that is no longer the case. The sysctls that provided this information were garbage collected a while back. Obtained from: NetApp
|
#
b060ba50 |
|
18-Mar-2013 |
Neel Natu <neel@FreeBSD.org> |
Simplify the assignment of memory to virtual machines by requiring a single command line option "-m <memsize in MB>" to specify the memory size. Prior to this change the user needed to explicitly specify the amount of memory allocated below 4G (-m <lowmem>) and the amount above 4G (-M <highmem>). The "-M" option is no longer supported by 'bhyveload' and 'bhyve'. The start of the PCI hole is fixed at 3GB and cannot be directly changed using command line options. However it is still possible to change this in special circumstances via the 'vm_set_lowmem_limit()' API provided by libvmmapi. Submitted by: Dinakar Medavaram (initial version) Reviewed by: grehan Obtained from: NetApp
|
#
485b3300 |
|
11-Feb-2013 |
Neel Natu <neel@FreeBSD.org> |
Implement guest vcpu pinning using 'pthread_setaffinity_np(3)'. Prior to this change pinning was implemented via an ioctl (VM_SET_PINNING) that called 'sched_bind()' on behalf of the user thread. The ULE implementation of 'sched_bind()' bumps up 'td_pinned' which in turn runs afoul of the assertion '(td_pinned == 0)' in userret(). Using the cpuset affinity to implement pinning of the vcpu threads works with both 4BSD and ULE schedulers and has the happy side-effect of getting rid of a bunch of code in vmm.ko. Discussed with: grehan
|
#
fbfc1c76 |
|
26-Oct-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Remove mptable generation code from libvmmapi and move it to bhyve. Firmware tables require too much knowledge of system configuration, and it's difficult to pass that information in general terms to a library. The upcoming ACPI work exposed this - it will also livein bhyve. Also, remove code specific to NetApp from the mptable name, and remove the -n option from bhyve. Reviewed by: neel Obtained from: NetApp
|
#
13f4cf6c |
|
12-Oct-2012 |
Neel Natu <neel@FreeBSD.org> |
Add an api to map a vm capability type into a string to be used for display purposes.
|
#
4b52a1e4 |
|
03-Oct-2012 |
Neel Natu <neel@FreeBSD.org> |
The ioctl VM_GET_MEMORY_SEG is no longer able to return the host physical address associated with the guest memory segment. This is because there is no longer a 1:1 mapping between GPA and HPA. As a result 'vmmctl' can only display the guest physical address and the length of the lowmem and highmem segments.
|
#
f7d51510 |
|
03-Oct-2012 |
Neel Natu <neel@FreeBSD.org> |
Change vm_malloc() to map pages in the guest physical address space in 4KB chunks. This breaks the assumption that the entire memory segment is contiguously allocated in the host physical address space. This also paves the way to satisfy the 4KB page allocations by requesting free pages from the VM subsystem as opposed to hard-partitioning host memory at boot time.
|
#
e9027382 |
|
25-Sep-2012 |
Neel Natu <neel@FreeBSD.org> |
Add ioctls to control the X2APIC capability exposed by the virtual machine to the guest. At the moment this simply sets the state in the 'vcpu' instance but there is no code that acts upon these settings.
|
#
177fd533 |
|
25-Aug-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Add sysctls to display the total and free amount of hard-wired mem for VMs # sysctl hw.vmm hw.vmm.mem_free: 2145386496 hw.vmm.mem_total: 2145386496 Submitted by: Takeshi HASEGAWA hasegaw at gmail com
|
#
1ff856db |
|
04-Aug-2012 |
Neel Natu <neel@FreeBSD.org> |
Allow the 'bhyve' process to control whether or not the virtual machine sees an ioapic. Obtained from: NetApp
|
#
90d4b48f |
|
03-Aug-2012 |
Neel Natu <neel@FreeBSD.org> |
API to map an apic id to the vcpu. At the moment this is a simple mapping because the numerical values are identical.
|
#
32c96bc8 |
|
03-Aug-2012 |
Neel Natu <neel@FreeBSD.org> |
There is no need to explicitly specify the CR4_VMXE bit when writing to guest CR4. This bit is specific to the Intel VTX and removing it makes the library more portable to AMD/SVM. In the Intel VTX implementation, the hypervisor will ensure that this bit is always set. See vmx_fix_cr4() for details. Suggested by: grehan
|
#
cd942e0f |
|
28-Apr-2012 |
Peter Grehan <grehan@FreeBSD.org> |
MSI-x interrupt support for PCI pass-thru devices. Includes instruction emulation for memory r/w access. This opens the door for io-apic, local apic, hpet timer, and legacy device emulation. Submitted by: ryan dot berryhill at sandvine dot com Reviewed by: grehan Obtained from: Sandvine
|
#
366f6083 |
|
12-May-2011 |
Peter Grehan <grehan@FreeBSD.org> |
Import of bhyve hypervisor and utilities, part 1. vmm.ko - kernel module for VT-x, VT-d and hypervisor control bhyve - user-space sequencer and i/o emulation vmmctl - dump of hypervisor register state libvmm - front-end to vmm.ko chardev interface bhyve was designed and implemented by Neel Natu. Thanks to the following folk from NetApp who helped to make this available: Joe CaraDonna Peter Snyder Jeff Heller Sandeep Mann Steve Miller Brian Pawlowski
|