History log of /freebsd-current/usr.sbin/bhyve/block_if.h
Revision Date Author Comments
# b3e76948 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 480bef94 16-Aug-2021 Corvin Köhne <corvink@FreeBSD.org>

bhyve: add bootindex option for several devices

The bootindex option creates an entry in the "bootorder" fwcfg file.
This file can be picked up by the guest firmware to determine the
bootorder. Nevertheless, it's not guaranteed that the guest firmware
uses the bootorder. At the moment, our OVMF ignores the bootorder. This
will change in the future.

If guest firmware supports the "bootorder" fwcfg file and no device uses
the bootindex option, the boot order is determined by the firmware
itself. If one or more devices specify a bootindex, the first bootable
device with the lowest bootindex will be booted. It's not garanteed that
devices without a bootindex will be recognized as bootable from the
firmware in that case.

Reviewed by: jhb
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39285


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# cd9618bd 23-Jun-2022 Vitaliy Gusev <gusev.vitaliy@gmail.com>

bhyve: Snapshot impovements for 'blockif' backend

When pausing a block I/O device model as part of suspending a VM, wait
for all active block I/O requests to finish before saving snapshot
data. This avoids having to save information about in-flight requests
both in the block_if layer and in storage device models.

For the AHCI device model, the queues are now guaranteed to be idle
when taking a snapshot, so remove the code to save queue state and
rely on the initial state in a resumed VM having all queues already
idle.

This will also simplify adding NVMe snapshot support in the future.

Reviewed by: jhb
Sponsored by: vStack
Differential Revision: https://reviews.freebsd.org/D26267


# 8794846a 11-Jun-2021 John Baldwin <jhb@FreeBSD.org>

bhyve: Add support for handling disk resize events to block_if.

Allow clients of blockif to register a resize callback handler. When
a callback is registered, register an EVFILT_VNODE kevent watching the
backing store for a change in the file's attributes. If the size has
changed when the kevent fires, invoke the clients' callback.

Currently resize detection is limited to backing stores that support
EVFILT_VNODE kevents such as regular files.

Reviewed by: grehan, markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30504


# 621b5090 26-Jun-2019 John Baldwin <jhb@FreeBSD.org>

Refactor configuration management in bhyve.

Replace the existing ad-hoc configuration via various global variables
with a small database of key-value pairs. The database supports
heirarchical keys using a MIB-like syntax to name the path to a given
key. Values are always stored as strings. The API used to manage
configuation values does include wrappers to handling boolean values.
Other values use non-string types require parsing by consumers.

The configuration values are stored in a tree using nvlists. Leaf
nodes hold string values. Configuration values are permitted to
reference other configuration values using '%(name)'. This permits
constructing template configurations.

All existing command line arguments now set configuration values. For
devices, the "-s" option parses its option argument to generate a list
of key-value pairs for the given device.

A new '-o' command line option permits setting an individual
configuration variable. The key name is always given as a full path
of dot-separated components.

A new '-k' command line option parses a simple configuration file.
This configuration file holds a flat list of 'key=value' lines where
the 'key' is the full path of a configuration variable. Lines
starting with a '#' are comments.

In general, bhyve starts by parsing command line options in sequence
and applying those settings to configuration values. Once this is
complete, bhyve then begins initializing its state based on the
configuration values. This means that subsequent configuration
options or files may override or supplement previously given settings.

A special 'config.dump' configuration value can be set to true to help
debug configuration issues. When this value is set, bhyve will print
out the configuration variables as a flat list of 'key=value' lines.

Most command line argments map to a single configuration variable,
e.g. '-w' sets the 'x86.strictmsr' value to false. A few command
line arguments have less obvious effects:

- Multiple '-p' options append their values (as a comma-seperated
list) to "vcpu.N.cpuset" values (where N is a decimal vcpu number).

- For '-s' options, a pci.<bus>.<slot>.<function> node is created.
The first argument to '-s' (the device type) is used as the value of
a "device" variable. Additional comma-separated arguments are then
parsed into 'key=value' pairs and used to set additional variables
under the device node. A PCI device emulation driver can provide
its own hook to override the parsing of the additonal '-s' arguments
after the device type.

After the configuration phase as completed, the init_pci hook
then walks the "pci.<bus>.<slot>.<func>" nodes. It uses the
"device" value to find the device model to use. The device
model's init routine is passed a reference to its nvlist node
in the configuration tree which it can query for specific
variables.

The result is that a lot of the string parsing is removed from
the device models and centralized. In addition, adding a new
variable just requires teaching the model to look for the new
variable.

- For '-l' options, a similar model is used where the string is
parsed into values that are later read during initialization.
One key note here is that the serial ports use the commonly
used lowercase names from existing documentation and examples
(e.g. "lpc.com1") instead of the uppercase names previously
used internally in bhyve.

Reviewed by: grehan
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D26035


# 483d953a 04-May-2020 John Baldwin <jhb@FreeBSD.org>

Initial support for bhyve save and restore.

Save and restore (also known as suspend and resume) permits a snapshot
to be taken of a guest's state that can later be resumed. In the
current implementation, bhyve(8) creates a UNIX domain socket that is
used by bhyvectl(8) to send a request to save a snapshot (and
optionally exit after the snapshot has been taken). A snapshot
currently consists of two files: the first holds a copy of guest RAM,
and the second file holds other guest state such as vCPU register
values and device model state.

To resume a guest, bhyve(8) must be started with a matching pair of
command line arguments to instantiate the same set of device models as
well as a pointer to the saved snapshot.

While the current implementation is useful for several uses cases, it
has a few limitations. The file format for saving the guest state is
tied to the ABI of internal bhyve structures and is not
self-describing (in that it does not communicate the set of device
models present in the system). In addition, the state saved for some
device models closely matches the internal data structures which might
prove a challenge for compatibility of snapshot files across a range
of bhyve versions. The file format also does not currently support
versioning of individual chunks of state. As a result, the current
file format is not a fixed binary format and future revisions to save
and restore will break binary compatiblity of snapshot files. The
goal is to move to a more flexible format that adds versioning,
etc. and at that point to commit to providing a reasonable level of
compatibility. As a result, the current implementation is not enabled
by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option
for userland builds, and the kernel option BHYVE_SHAPSHOT.

Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai
Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz
Relnotes: yes
Sponsored by: University Politehnica of Bucharest
Sponsored by: Matthew Grooms (student scholarships)
Sponsored by: iXsystems
Differential Revision: https://reviews.freebsd.org/D19495


# 8c74ade8 02-May-2019 John Baldwin <jhb@FreeBSD.org>

Increase the VirtIO segment count to support modern Windows guests.

The Windows virtio driver ignores the advertized seg_max field and
assumes the host can accept up to 67 segments in indirect descriptors,
triggering an assert in the bhyve process.

This brings back r282922 but with a couple of changes:
- It raises the block interface segment limit to 128 instead of 67.
- Linux's virtio driver assumes that the segment limit is no
larger than the ring size. To avoid breaking Linux guests,
raise the VirtIO ring size to 128, and cap the VirtIO segment
limit at ring size - 2 (effectively 126).

Reviewed by: rgrimes, Patrick Mooney <pmooney@pfmooney.com>
Obtained from: Joyent (Linux workaround)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18831


# c066c68c 04-Jul-2018 Marcelo Araujo <araujo@FreeBSD.org>

- Add bhyve NVMe device emulation.

The initial work on bhyve NVMe device emulation was done by the GSoC student
Shunsuke Mie and was heavily modified in performan, functionality and
guest support by Leon Dang.

bhyve:
-s <n>,nvme,devpath,maxq=#,qsz=#,ioslots=#,sectsz=#,ser=A-Z

accepted devpath:
/dev/blockdev
/path/to/image
ram=size_in_MiB

Tested with guest OS: FreeBSD Head, Linux Fedora fc27, Ubuntu 18.04,
OpenSuse 15.0, Windows Server 2016 Datacenter.
Tested with all accepted device paths: Real nvme, zdev and also with ram.
Tested on: AMD Ryzen Threadripper 1950X 16-Core Processor and
Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz.

Tests at: https://people.freebsd.org/~araujo/bhyve_nvme/nvme.txt

Submitted by: Shunsuke Mie <sux2mfgj_gmail.com>,
Leon Dang <leon_digitalmsx.com>
Reviewed by: chuck (early version), grehan
Relnotes: Yes
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D14022


# 1de7b4b8 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

various: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# fd198814 20-May-2015 Peter Grehan <grehan@FreeBSD.org>

Temporarily revert r282922 which bumped the max descriptors.

While there is no issued with the number of descriptors in
a virtio indirect descriptor, it's a guest's choice as to
whether indirect descriptors are used. For the case where
they aren't, the virtio block ring size is still 64 which
is less than the now reported max_segs of 67. This results
in an assertion in recent Linux guests even though it was
benign since they were using indirect descs.

The intertwined relationship between virtio ring size,
max seg size and blockif queue size will be addressed
in an upcoming commit, at which point the max descriptors
will again be bumped up to 67.


# 253396a3 14-May-2015 Peter Grehan <grehan@FreeBSD.org>

Bump the size of the blockif scatter-gather list to 67.

The Windows virtio driver ignores the advertized seg_max
field and assumes the host can accept up to 67 segments
in indirect descriptors, triggering an assert in the bhyve
process.

No objection from: mav
Reviewed by: neel
Reported and tested by: Leon Dang (ldang@nahannisys.com)
MFC after: 2 weeks


# bb1524af 18-Apr-2015 Alexander Motin <mav@FreeBSD.org>

Workaround bhyve virtual disks operation on top of GEOM providers.

GEOM does not support scatter/gather lists in its I/Os. Such requests
are cut in pieces by physio(), that may be problematic, if those pieces
are not multiple of provider's sector size. If such case is detected,
move the data through temporary sequential buffer.

MFC after: 2 weeks


# 54b7bb76 16-Mar-2015 Alexander Motin <mav@FreeBSD.org>

Increase S/G list size of 32 to 33 entries.

32 entries are not enough for the worst case of misaligned 128KB request,
that made FreeBSD to chunk large quests in odd pieces.

MFC after: 2 weeks


# 0b9d25c9 13-Mar-2015 Alexander Motin <mav@FreeBSD.org>

Add DSM TRIM command support for virtual AHCI disks.

It works only for virtual disks backed by ZVOLs and raw devices supporting
BIO_DELETE. Virtual disks backed by files won't report this capability.

MFC after: 2 weeks
Relnotes: yes


# 94682383 04-Mar-2015 Alexander Motin <mav@FreeBSD.org>

Report logical/physical sector sizes for virtual SATA disk.

MFC after: 2 weeks


# c4813fad 14-Jul-2014 Peter Grehan <grehan@FreeBSD.org>

Add a call to synthesize a C/H/S value for block emulations
that require it (ahci). The algorithm used is from the VHD
specification.


# 7cf5a7ee 04-Oct-2013 Peter Grehan <grehan@FreeBSD.org>

Block-layer backend interface for bhyve block-io device emulations.

Approved by: re@ (blanket)