#
0bda8d3e |
| 07-Sep-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
vmm: permit some IPIs to be handled by userspace
Add VM_EXITCODE_IPI to permit returning unhandled IPIs to userland. INIT and STARTUP IPIs are now returned to userland. For backward compatibility, VM_EXITCODE_IPI must be enabled through a newly added capability.
Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D35623 Sponsored by: Beckhoff Automation GmbH & Co. KG
|
#
c6d31b83 |
| 18-Jul-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
AST: rework
Make most AST handlers dynamically registered. This allows subsystem-specific handler source to live in the subsystem files, instead of making subr_trap.c aware of it. For instance, signal delivery code on return to userspace is now moved to kern_sig.c.
Also, it allows some handlers to be designated as the cleanup (kclear) type, which are called both at AST and on thread/process exit. For instance, ast(), exit1(), and the NFS server no longer need to be aware of UFS softdep processing.
The dynamic registration also allows third-party modules to register AST handlers if needed. There is one caveat with loadable modules: the code makes no effort to ensure that a module is not unloaded before all threads have finished running the AST handlers in it. In fact, hwpmc.ko and ufs.ko already behave this way. I do not think it is worth the effort and the runtime overhead to try to fix it.
Reviewed by: markj Tested by: emaste (arm64), pho Discussed with: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35888
|
#
6b389740 |
| 12-Jul-2022 |
Mark Johnston <markj@FreeBSD.org> |
vm_object: Modify various drivers to allocate OBJT_SWAP objects
This is in preparation for removal of OBJT_DEFAULT. In particular, it is now cheap to check whether an OBJT_SWAP object has any swap blocks allocated, so the benefit of having a separate OBJT_DEFAULT type is quite marginal, and the OBJT_DEFAULT->SWAP transition is a source of bugs.
Reviewed by: alc, hselasky, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D35779
|
#
e7d34aed |
| 09-Jun-2022 |
Vitaliy Gusev <gusev.vitaliy@gmail.com> |
vmm: move bumping VMEXIT_USERSPACE stat to the right place
Statistic for "number of vm exits handled in userspace" should be increased in vm_run() instead of vmx_run() because in some cases vm_run() doesn't exit to userspace and keeps entering the guest.
In addition, svm_run()'s implementation did not bump that stat at all.
Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D35350
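The counting rule described in this commit can be illustrated with a minimal sketch. It is not the vmm code: `run_until_userspace`, `struct run_stats`, and `enum exit_kind` are hypothetical names invented for this example. The point it shows is that the "handled in userspace" counter is bumped in the outer loop, only when an exit is actually returned to userspace, and not for exits that are handled in the kernel before re-entering the guest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exit kinds: handled in the kernel, or returned to userspace. */
enum exit_kind { EXIT_HANDLED_IN_KERNEL, EXIT_TO_USERSPACE };

/* Hypothetical counters, standing in for the real statistics. */
struct run_stats {
    unsigned long total_exits;     /* every guest exit seen by the loop */
    unsigned long userspace_exits; /* only exits returned to userspace */
};

/*
 * Hypothetical stand-in for the outer run loop. It consumes exits until one
 * must be returned to userspace and bumps the userspace counter exactly
 * there. Returns the number of exits consumed.
 */
static size_t
run_until_userspace(const enum exit_kind *exits, size_t n,
    struct run_stats *st)
{
    size_t i;

    assert(exits != NULL && st != NULL);
    for (i = 0; i < n; i++) {
        st->total_exits++;
        if (exits[i] == EXIT_TO_USERSPACE) {
            st->userspace_exits++;
            return (i + 1);
        }
        /* Handled in the kernel: loop around and re-enter the guest. */
    }
    return (n);
}
```

Counting inside the per-exit backend routine instead would have incremented the counter for the kernel-handled exits as well, which is the miscount the commit message describes.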
|
#
3ba952e1 |
| 30-May-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
vmm: add tunable to trap WBINVD
x86 is cache coherent. However, there are special cases where cache coherency isn't ensured (e.g. when switching the caching mode). In these cases, WBINVD can be used. WBINVD writes all cache lines back into main memory and invalidates the whole cache.
Due to the invalidation of the whole cache, WBINVD is a very heavy instruction and degrades the performance on all cores. So, we should minimize the use of WBINVD as much as possible.
In a virtual environment, the WBINVD call is mostly useless. The guest isn't able to break cache coherency because it can't switch the physical cache mode. When using PCI passthrough, WBINVD might be useful.
Nevertheless, trapping and ignoring WBINVD is an unsafe operation. For that reason, we implement it as a tunable.
Reviewed by: jhb Sponsored by: Beckhoff Automation GmbH & Co. KG MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D35253
|
Revision tags: release/13.1.0 |
|
#
246c3981 |
| 18-Mar-2022 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
bhyve: Do not remove guest physical addresses from IOMMU host domain
This permits I/O devices on the host to directly access wired memory dedicated to guests using passthru devices. Note that wired memory belonging to guests that do not use passthru devices has always been accessible by I/O devices on the host.
bhyve maps guest physical addresses into the user address space of the bhyve process by mmap'ing /dev/vmm/<vmname>. Device models pass pointers derived from this mapping directly to system calls such as preadv() to minimize copies when emulating DMA. If the backing store for a device model is a raw host device (e.g. when exporting a raw disk device such as /dev/ada<n> as a drive in the guest), the host device driver (e.g. ahci for /dev/ada<n>) can itself use DMA on the host directly to the guest's memory. However, if the guest's memory is not present in the host IOMMU domain, these DMA requests by the host device will fail without raising an error visible to the host device driver or to the guest resulting in non-working I/O in the guest.
It is unclear why guest addresses were removed from the IOMMU host domain initially, especially only for VMs with a passthru device, as the host IOMMU domain does not affect the permissions of passthru devices, only those of devices on the host.
A considered alternative was using bounce buffers instead (D34535 is a proof of concept), but that adds additional overhead for unclear benefit.
This solves a long-standing problem when using passthru devices and physical disks in the same VM.
Thanks to: grehan (patience and help) Thanks to: jhb (for improving the commit message) PR: 260178 Reviewed by: grehan, jhb MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D34607
|
#
e47fe318 |
| 10-Mar-2022 |
Corvin Köhne <CorvinK@beckhoff.com> |
bhyve: add ROM emulation
Some PCI devices, especially GPUs, require a ROM to work properly. The ROM is executed by boot firmware to initialize the device. To add a ROM to a device, use the new ROM option for passthru devices (e.g. -s passthru,0/2/0,rom=<path>/<to>/<rom>).
The ROM must be executed by the boot firmware; it won't be executed by any OS. Additionally, the boot firmware should be configured to execute the ROM file. For that reason, a ROM can only be used with OVMF with bus enumeration enabled.
Differential Revision: https://reviews.freebsd.org/D33129 Sponsored by: Beckhoff Automation GmbH & Co. KG MFC after: 1 month
|
#
73505a10 |
| 01-Mar-2022 |
Robert Wing <rew@FreeBSD.org> |
vmm: fix "set but not used" warnings
|
#
e2650af1 |
| 30-Dec-2021 |
Stefan Eßer <se@FreeBSD.org> |
Make CPU_SET macros compliant with other implementations
The introduction of <sched.h> improved compatibility with some 3rd party software, but caused the configure scripts of some ports to assume that they were run in a GLIBC compatible environment.
Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being added to ports, but there still were compatibility issues due to invalid assumptions made in autoconfigure scripts.
The difference between the FreeBSD versions of macros like CPU_AND, CPU_OR, etc. and the GLIBC versions was in the number of arguments: FreeBSD used a 2-address scheme (one source argument is also used as the destination of the operation), while GLIBC uses a 3-address scheme (2 source operands and a separately passed destination).
The GLIBC scheme provides a super-set of the functionality of the FreeBSD macros, since it does not prevent passing the same variable as source and destination arguments. In code that wanted to preserve both source arguments, the FreeBSD macros required a temporary copy of one of the source arguments.
This patch set allows the functions and macros expected by 3rd party software written for GLIBC based systems to be provided unconditionally, but breaks builds of externally maintained sources that use any of the following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.
One contributed driver (contrib/ofed/libmlx5) has been patched to support both the old and the new CPU_OR signatures. If this commit is merged to -STABLE, the version test will have to be extended to cover more ranges.
Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT no longer require that option.
The FreeBSD version has been bumped to 1400046 to reflect this incompatible change.
Reviewed by: kib MFC after: 2 weeks Relnotes: yes Differential Revision: https://reviews.freebsd.org/D33451
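The 2-address versus 3-address distinction described above can be sketched with a toy bit set. Everything here is hypothetical and for illustration only: `toy_set_t`, `TOY_OR2`, and `TOY_OR3` are invented names, not the real `cpuset_t` or `CPU_OR` definitions. The sketch shows why the 2-address form needs a temporary copy to preserve both sources, and why the 3-address form is a superset (the destination may alias a source).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy set; NOT the real cpuset_t. */
typedef struct { uint64_t bits; } toy_set_t;

/* 2-address form: dst is both the first source and the destination. */
#define TOY_OR2(dst, src)        ((dst)->bits |= (src)->bits)
/* 3-address form: two sources and a separately passed destination. */
#define TOY_OR3(dst, src1, src2) ((dst)->bits = (src1)->bits | (src2)->bits)

/* Union with the 2-address form: a temporary copy keeps 'a' intact. */
static uint64_t
toy_union_2addr(const toy_set_t *a, const toy_set_t *b)
{
    toy_set_t tmp;

    assert(a != NULL && b != NULL);
    tmp = *a;
    TOY_OR2(&tmp, b);
    return (tmp.bits);
}

/* Union with the 3-address form: no copy of a source is needed. */
static uint64_t
toy_union_3addr(const toy_set_t *a, const toy_set_t *b)
{
    toy_set_t out;

    assert(a != NULL && b != NULL);
    TOY_OR3(&out, a, b);
    return (out.bits);
}
```

Calling `TOY_OR3(&a, &a, &b)` reproduces the 2-address behavior, which is the sense in which the 3-address scheme covers the older one.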
|
Revision tags: release/12.3.0 |
|
#
df95cc76 |
| 02-Aug-2021 |
Ka Ho Ng <khng@FreeBSD.org> |
vmm: Bump vmname buffer in struct vm to VM_MAX_NAMELEN + 1
In the hw.vmm.create sysctl handler, the maximum length of a VM name is VM_MAX_NAMELEN. However, vm_create() only allows names of up to VM_MAX_NAMELEN - 1 characters. Bump the size of the internal buffer so that a VM name can be VM_MAX_NAMELEN characters long.
MFC after: 3 days Reviewed by: grehan Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31372
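The off-by-one behind this change is the usual one for NUL-terminated strings: a buffer of N bytes holds at most N - 1 characters. The sketch below is illustrative only; `TOY_MAX_NAMELEN`, the two structs, and `toy_set_name` are hypothetical and the limit is deliberately tiny, so none of it reflects the actual vmm definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NAMELEN 8 /* hypothetical limit, chosen small for the demo */

/* Before: room for TOY_MAX_NAMELEN - 1 characters plus the NUL. */
struct toy_vm_old { char name[TOY_MAX_NAMELEN]; };
/* After: room for TOY_MAX_NAMELEN characters plus the NUL. */
struct toy_vm_new { char name[TOY_MAX_NAMELEN + 1]; };

/*
 * Copy 'name' into 'buf' if it fits together with its terminating NUL.
 * Returns 0 on success and -1 if the name is too long.
 */
static int
toy_set_name(char *buf, size_t bufsz, const char *name)
{
    size_t len;

    assert(buf != NULL && name != NULL && bufsz > 0);
    len = strlen(name);
    if (len >= bufsz)
        return (-1);
    memcpy(buf, name, len + 1);
    return (0);
}
```

With these toy sizes, an 8-character name is rejected by the old layout and accepted by the new one, mirroring the mismatch between the two limits described in the commit message.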
|
Revision tags: release/13.0.0 |
|
#
f8a6ec2d |
| 18-Mar-2021 |
D Scott Phillips <scottph@FreeBSD.org> |
bhyve: support relocating fbuf and passthru data BARs
We want to allow the UEFI firmware to enumerate and assign addresses to PCI devices so we can boot from NVMe[1]. Address assignment of PCI BARs is properly handled by the PCI emulation code in general, but a few specific cases need additional support. fbuf and passthru map additional objects into the guest physical address space and so need to handle address updates. Here we add a callback to emulated PCI devices to inform them of a BAR configuration change. fbuf and passthru then watch for these BAR changes and relocate the frame buffer memory segment and passthru device mmio area respectively.
We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls to vmm(4) to facilitate the unmapping needed for address updates.
[1]: https://github.com/freebsd/uefi-edk2/pull/9/
Originally by: scottph MFC After: 1 week Sponsored by: Intel Corporation Reviewed by: grehan Approved by: philip (mentor) Differential Revision: https://reviews.freebsd.org/D24066
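The callback idea described in this commit can be sketched roughly as follows. This is not bhyve's PCI emulation interface: `struct toy_dev`, `toy_baraddr_cb`, `toy_fbuf_baraddr`, and `toy_update_bar` are hypothetical names, and the "mapping" is reduced to a single recorded address. The sketch only shows the shape of the mechanism: when a BAR moves, the device model is told to drop the mapping at the old address and establish it at the new one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_dev;

/* Hypothetical callback: told when a BAR is disabled or enabled at addr. */
typedef void (*toy_baraddr_cb)(struct toy_dev *dev, int idx, int enabled,
    uint64_t addr);

struct toy_dev {
    uint64_t bar_addr[6];      /* current guest address of each BAR */
    uint64_t mapped_at;        /* where the extra object is mapped; 0 = none */
    toy_baraddr_cb on_baraddr; /* optional per-device callback */
};

/* Hypothetical device-side handler: follow the BAR with the mapping. */
static void
toy_fbuf_baraddr(struct toy_dev *dev, int idx, int enabled, uint64_t addr)
{
    (void)idx;
    /* A real device model would unmap or map a memory segment here. */
    dev->mapped_at = enabled ? addr : 0;
}

/* Generic side: apply a BAR move and notify the device around it. */
static void
toy_update_bar(struct toy_dev *dev, int idx, uint64_t new_addr)
{
    assert(dev != NULL && idx >= 0 && idx < 6);
    if (dev->bar_addr[idx] == new_addr)
        return;
    if (dev->on_baraddr != NULL)
        dev->on_baraddr(dev, idx, 0, dev->bar_addr[idx]);
    dev->bar_addr[idx] = new_addr;
    if (dev->on_baraddr != NULL)
        dev->on_baraddr(dev, idx, 1, new_addr);
}
```

Devices that map nothing extra into the guest simply leave the callback unset, so the generic BAR handling is unchanged for them.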
|
#
3c48106a |
| 29-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
bhyve: limit max GPA to VM_MAXUSER_ADDRESS_LA48.
We use 4-level EPT pages, so correct the upper bound.
Reviewed by: grehan Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27402
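The "LA48" in the constant's name follows from simple arithmetic: with 4 KB pages, each paging level translates 9 address bits and the page offset covers 12, so four levels span 48 bits. The helper names below are hypothetical, and the sketch does not reproduce the kernel's actual definition of VM_MAXUSER_ADDRESS_LA48; it only works through the bit count.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT     12 /* 4 KB page offset bits */
#define TOY_BITS_PER_LEVEL 9  /* 512 entries per page-table page */

/* Number of address bits translated by a walk of 'levels' levels. */
static unsigned
toy_addr_bits(unsigned levels)
{
    assert(levels >= 1 && levels <= 5);
    return (TOY_PAGE_SHIFT + levels * TOY_BITS_PER_LEVEL);
}

/* Size in bytes of the address space covered by 'levels' levels. */
static uint64_t
toy_addr_space_size(unsigned levels)
{
    return ((uint64_t)1 << toy_addr_bits(levels));
}
```

For four levels this gives 48 bits, i.e. a 256 TB address space, which is why a guest physical address limit derived from a wider layout would exceed what 4-level tables can map.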
|
#
15add60d |
| 28-Nov-2020 |
Peter Grehan <grehan@FreeBSD.org> |
Convert vmm_ops calls to IFUNC
There is no need for these to be function pointers since they are never modified post-module load.
Rename AMD/Intel ops to be more consistent.
Submitted by: adam_fenn.io Reviewed by: markj, grehan Approved by: grehan (bhyve) MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D27375
|
Revision tags: release/12.2.0 |
|
#
543769bf |
| 01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
amd64: clean up empty lines in .c and .h files
|
#
e2515283 |
| 27-Aug-2020 |
Glen Barber <gjb@FreeBSD.org> |
MFH
Sponsored by: Rubicon Communications, LLC (netgate.com)
|
#
46567b4f |
| 18-Aug-2020 |
Peter Grehan <grehan@FreeBSD.org> |
Allow guest device MMIO access from bootmem memory segments.
Recent versions of UEFI have moved local APIC timer initialization into the early SEC phase which runs out of ROM, prior to self-relocating into RAM. This results in a hypervisor exit.
Currently bhyve prevents instruction emulation from segments that aren't marked as "sysmem" aka guest RAM, with the vm_gpa_hold() routine failing. However, there is no reason for this restriction: the hypervisor already controls whether EPT mappings are marked as executable.
Fix by dropping the redundant check of sysmem.
MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D25955
|
Revision tags: release/11.4.0 |
|
#
483d953a |
| 05-May-2020 |
John Baldwin <jhb@FreeBSD.org> |
Initial support for bhyve save and restore.
Save and restore (also known as suspend and resume) permits a snapshot to be taken of a guest's state that can later be resumed. In the current implementation, bhyve(8) creates a UNIX domain socket that is used by bhyvectl(8) to send a request to save a snapshot (and optionally exit after the snapshot has been taken). A snapshot currently consists of two files: the first holds a copy of guest RAM, and the second file holds other guest state such as vCPU register values and device model state.
To resume a guest, bhyve(8) must be started with a matching pair of command line arguments to instantiate the same set of device models as well as a pointer to the saved snapshot.
While the current implementation is useful for several use cases, it has a few limitations. The file format for saving the guest state is tied to the ABI of internal bhyve structures and is not self-describing (in that it does not communicate the set of device models present in the system). In addition, the state saved for some device models closely matches the internal data structures, which might prove a challenge for compatibility of snapshot files across a range of bhyve versions. The file format also does not currently support versioning of individual chunks of state. As a result, the current file format is not a fixed binary format, and future revisions to save and restore will break binary compatibility of snapshot files. The goal is to move to a more flexible format that adds versioning, etc. and at that point to commit to providing a reasonable level of compatibility. For these reasons, the current implementation is not enabled by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option for userland builds, and the kernel option BHYVE_SNAPSHOT.
Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz Relnotes: yes Sponsored by: University Politehnica of Bucharest Sponsored by: Matthew Grooms (student scholarships) Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D19495
|
#
00d3723f |
| 20-Apr-2020 |
Conrad Meyer <cem@FreeBSD.org> |
vmm(4): Bump VM_MAX_MEMMAPS for vmgenid
As a short-term solution for the problem reported by Shawn Webb re: r359950, bump the maximum number of memmaps per VM. This structure is 40 bytes, and the four additional members (of a fixed array embedded in struct vm) increase the size of struct vm by 3%.
(The vast majority of struct vm is the embedded struct vcpu array, which accounts for 84% of the size -- over 4 kB.)
Reported by: Shawn Webb <shawn.webb AT hardenedbsd.org> Reviewed by: grehan X-MFC-With: r359950 Differential Revision: https://reviews.freebsd.org/D24507
|
#
b33a8b38 |
| 16-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357966 through r357999.
|
#
b40598c5 |
| 15-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (4 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Reviewed by: kib Approved by: kib (mentor) Differential Revision: https://reviews.freebsd.org/D23625 X-Generally looks fine: jhb
|
#
74dc6beb |
| 14-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357855 through r357920.
|
#
caab5042 |
| 13-Feb-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
vmm: Add Hygon Dhyana support.
Submitted by: Pu Wen <puwen@hygon.cn> Discussed with: grehan Reviewed by: jhb (previous version) MFC after: 1 week Differential revision: https://reviews.freebsd.org/D23553
|
#
b837dadd |
| 02-Jan-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
bhyve: terminate waiting loops if thread suspension is requested.
PR: 242724 Reviewed by: markj Reported and tested by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22881
|
Revision tags: release/12.1.0 |
|
#
869dbab7 |
| 19-Oct-2019 |
Andriy Gapon <avg@FreeBSD.org> |
vmm: remove a wmb() call
After removing wmb(), vm_set_rendezvous_func() became super trivial, so there was no point in keeping it.
The wmb (sfence on amd64, lock nop on i386) was not needed. This can be explained from several points of view.
First, wmb() is used for store-store ordering (although the primitive is undocumented). There was no obvious subsequent store that needed the barrier.
Second, x86 has a memory model with strong ordering, including total store order. An explicit store barrier may be needed only when working with special memory (device, special caching mode) or using special instructions (non-temporal stores). That was not the case for this code.
Third, I believe that there is a misconception that sfence "flushes" the store buffer in the sense that it speeds up the propagation of stores from the store buffer to global visibility. I think that such propagation always happens as fast as possible. sfence only makes subsequent stores wait for that propagation to complete. So, sfence is only useful for ordering of stores and only in the situations described above.
Reviewed by: jhb MFC after: 23 days Differential Revision: https://reviews.freebsd.org/D21978
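For contrast, here is a minimal sketch of a case where store-store ordering does matter: a payload store that must become visible before a flag store that publishes it. It uses portable C11 atomics rather than the kernel's wmb(), and `struct toy_mailbox`, `toy_publish`, and `toy_consume` are hypothetical names. It is not taken from vmm; the removed barrier had no such second store to order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox shared between a writer and a reader. */
struct toy_mailbox {
    int payload;       /* plain data, written before the flag */
    atomic_bool ready; /* flag whose store publishes the payload */
};

/* Writer: store the payload, then publish it with a release store. */
static void
toy_publish(struct toy_mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    /* Release: the payload store cannot be reordered after this store. */
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Reader: read the payload only after observing the flag with acquire. */
static bool
toy_consume(struct toy_mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return (false);
    *out = m->payload;
    return (true);
}
```

On x86, ordinary stores already keep their program order, so for memory like this the release store typically needs no fence instruction at all, which is consistent with the second point in the commit message.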
|
#
8b3bc70a |
| 08-Oct-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r352764 through r353315.
|