api.rst (c21d54f0307ff42a346294899107b570b98c47b5) | api.rst (fb04a1eddb1a65b6588a021bdc132270d5ae48bb) |
---|---|
.. SPDX-License-Identifier: GPL-2.0

===================================================================
The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description
======================

.. 248 unchanged lines hidden

:Type: system ioctl
:Parameters: none
:Returns: size of vcpu mmap area, in bytes

The KVM_RUN ioctl (cf.) communicates with userspace via a shared
memory region.  This ioctl returns the size of that region.  See the
KVM_RUN documentation for details.

Besides the KVM_RUN communication region, other areas of the VCPU file
descriptor can be mmap-ed, including:

- if KVM_CAP_COALESCED_MMIO is available, a page at
  KVM_COALESCED_MMIO_PAGE_OFFSET * PAGE_SIZE; for historical reasons,
  this page is included in the result of KVM_GET_VCPU_MMAP_SIZE.
  KVM_CAP_COALESCED_MMIO is not documented yet.

- if KVM_CAP_DIRTY_LOG_RING is available, a number of pages at
  KVM_DIRTY_LOG_PAGE_OFFSET * PAGE_SIZE.  For more information on
  KVM_CAP_DIRTY_LOG_RING, see section 8.29.

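The offsets in the list above are expressed in pages, so userspace has to
scale them by the page size before calling mmap(). The following is a
minimal sketch of that arithmetic only. The ``vcpu_mmap_area`` struct and
the ``vcpu_area`` helper are hypothetical names introduced for
illustration, and rounding the length up to whole pages is an assumption
rather than something this document specifies. Real code would take the
page offset from ``<linux/kvm.h>`` and the page size from
``sysconf(_SC_PAGESIZE)``.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: where an extra vcpu area lives in the fd. */
struct vcpu_mmap_area {
    uint64_t offset;  /* byte offset to pass to mmap() */
    uint64_t length;  /* length in bytes, a whole number of pages */
};

/*
 * Convert a page-granular offset (for example KVM_DIRTY_LOG_PAGE_OFFSET)
 * and a size in bytes into the byte offset and page-rounded length of
 * the area. Rounding up to whole pages is an assumption of this sketch.
 */
static struct vcpu_mmap_area vcpu_area(uint64_t page_offset,
                                       uint64_t bytes,
                                       uint64_t page_size)
{
    struct vcpu_mmap_area area;

    assert(page_size != 0);
    area.offset = page_offset * page_size;
    area.length = (bytes + page_size - 1) / page_size * page_size;
    return area;
}
```

The result would typically be passed on as
``mmap(NULL, area.length, PROT_READ | PROT_WRITE, MAP_SHARED, vcpu_fd, area.offset)``,
where ``vcpu_fd`` is the file descriptor returned by KVM_CREATE_VCPU.
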
4.6 KVM_SET_MEMORY_REGION
-------------------------

:Capability: basic
:Architectures: all
:Type: vm ioctl
:Parameters: struct kvm_memory_region (in)
:Returns: 0 on success, -1 on error

.. 6117 unchanged lines hidden

-----------------------------

Architectures: x86

When enabled, KVM will disable paravirtual features provided to the
guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
(0x40000001). Otherwise, a guest may use the paravirtual features
regardless of what has actually been exposed through the CPUID leaf.


8.29 KVM_CAP_DIRTY_LOG_RING
---------------------------

:Architectures: x86
:Parameters: args[0] - size of the dirty log ring

KVM is capable of tracking dirty memory using ring buffers that are
mmap-ed into userspace; there is one dirty ring per vcpu.

The dirty ring is available to userspace as an array of
``struct kvm_dirty_gfn``.  Each dirty entry is defined as::

  struct kvm_dirty_gfn {
          __u32 flags;
          __u32 slot; /* as_id | slot_id */
          __u64 offset;
  };

The following values are defined for the flags field to define the
current state of the entry::

  #define KVM_DIRTY_GFN_F_DIRTY           BIT(0)
  #define KVM_DIRTY_GFN_F_RESET           BIT(1)
  #define KVM_DIRTY_GFN_F_MASK            0x3

Userspace should call the KVM_ENABLE_CAP ioctl right after the
KVM_CREATE_VM ioctl to enable this capability for the new guest and set
the size of the rings.  Enabling the capability is only allowed before
creating any vCPU, and the size of the ring must be a power of two.  The
larger the ring buffer, the less likely it is that the ring fills up and
the VM is forced to exit to userspace.  The optimal size depends on the
workload, but it is recommended that it be at least 64 KiB (4096
entries).

Just like for dirty page bitmaps, the buffer tracks writes to
all user memory regions for which the KVM_MEM_LOG_DIRTY_PAGES flag was
set in KVM_SET_USER_MEMORY_REGION.  Once a memory region is registered
with the flag set, userspace can start harvesting dirty pages from the
ring buffer.

An entry in the ring buffer can be unused (flag bits ``00``),
dirty (flag bits ``01``) or harvested (flag bits ``1X``).  The
state machine for the entry is as follows::

       dirtied         harvested        reset
  00 -----------> 01 -------------> 1X -------+
   ^                                          |
   |                                          |
   +------------------------------------------+

To harvest the dirty pages, userspace accesses the mmap-ed ring buffer
to read the dirty GFNs.  If the flags field has the DIRTY bit set (at
this stage the RESET bit must be cleared), the entry describes a dirty
GFN.  Userspace should harvest this GFN, change the flags from state
``01b`` to ``1Xb`` (bit 0 will be ignored by KVM, but bit 1 must be set
to show that this GFN is harvested and waiting for a reset), and move
on to the next GFN.  Userspace should continue to do this until it finds
an entry whose flags have the DIRTY bit cleared, meaning that it has
harvested all the dirty GFNs that were available.

It's not necessary for userspace to harvest all the dirty GFNs at once.
However, it must collect the dirty GFNs in sequence, i.e., the userspace
program cannot skip one dirty GFN to collect the one next to it.

After processing one or more entries in the ring buffer, userspace
calls the VM ioctl KVM_RESET_DIRTY_RINGS to notify the kernel about
it, so that the kernel will reprotect those collected GFNs.
Therefore, the ioctl must be called *before* reading the content of
the dirty pages.

The dirty ring can get full.  When that happens, the KVM_RUN of the
vcpu will return with exit reason KVM_EXIT_DIRTY_LOG_FULL.

The dirty ring interface differs from the KVM_GET_DIRTY_LOG interface
in one major respect: when userspace reads the dirty ring, the kernel
may not yet have flushed the processor's dirty page buffers into the
kernel buffer (with dirty bitmaps, the flushing is done by the
KVM_GET_DIRTY_LOG ioctl).  To force the flush, one needs to kick the
vcpu out of KVM_RUN using a signal.  The resulting vmexit ensures that
all dirty GFNs are flushed to the dirty rings.
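
The size rule and the harvesting procedure described in this section can
be sketched in userspace C as follows. This is an illustration under
stated assumptions, not the reference implementation: ``struct dirty_gfn``
and the ``GFN_F_*`` macros are local mirrors of the definitions quoted in
this section, while ``ring_size_is_valid``, ``ring_entries``,
``harvest_ring``, ``count_gfn`` and the ``next`` cursor are hypothetical
names. The acquire/release ordering, done here with the GCC/Clang
``__atomic`` builtins, is also an assumption of the sketch; this section
does not spell out memory-ordering requirements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct kvm_dirty_gfn as quoted in this section. */
struct dirty_gfn {
    uint32_t flags;
    uint32_t slot;    /* as_id | slot_id */
    uint64_t offset;
};

#define GFN_F_DIRTY (1u << 0)   /* mirrors KVM_DIRTY_GFN_F_DIRTY */
#define GFN_F_RESET (1u << 1)   /* mirrors KVM_DIRTY_GFN_F_RESET */

/* The ring size (in bytes) must be a nonzero power of two. */
static int ring_size_is_valid(uint64_t bytes)
{
    return bytes != 0 && (bytes & (bytes - 1)) == 0;
}

/* Number of entries in a ring of the given size in bytes. */
static size_t ring_entries(uint64_t bytes)
{
    return (size_t)(bytes / sizeof(struct dirty_gfn));
}

/* Hypothetical callback type invoked once per harvested GFN. */
typedef void (*gfn_cb)(uint32_t slot, uint64_t offset, void *opaque);

/* Example callback: count harvested entries through opaque. */
static void count_gfn(uint32_t slot, uint64_t offset, void *opaque)
{
    (void)slot;
    (void)offset;
    (*(size_t *)opaque)++;
}

/*
 * Harvest entries in sequence, starting at the cursor *next, and stop
 * at the first entry that is not in the dirty state (01). Each harvested
 * entry is moved to state 1X by setting the RESET bit. Returns the
 * number of entries harvested. The caller is expected to issue
 * KVM_RESET_DIRTY_RINGS afterwards (not shown here).
 */
static size_t harvest_ring(struct dirty_gfn *ring, size_t entries,
                           size_t *next, gfn_cb cb, void *opaque)
{
    size_t harvested = 0;

    assert(entries != 0 && (entries & (entries - 1)) == 0);

    while (harvested < entries) {
        struct dirty_gfn *e = &ring[*next & (entries - 1)];
        uint32_t flags = __atomic_load_n(&e->flags, __ATOMIC_ACQUIRE);

        /* Only state 01 (DIRTY set, RESET clear) may be harvested. */
        if ((flags & (GFN_F_DIRTY | GFN_F_RESET)) != GFN_F_DIRTY)
            break;

        cb(e->slot, e->offset, opaque);
        __atomic_store_n(&e->flags, GFN_F_RESET, __ATOMIC_RELEASE);
        (*next)++;
        harvested++;
    }
    return harvested;
}
```

A caller would keep one ``next`` cursor per vcpu ring and, whenever
``harvest_ring`` returns a nonzero count, issue KVM_RESET_DIRTY_RINGS on
the VM file descriptor before reading the contents of the harvested pages.
Stopping at the first non-dirty entry reflects the rule that dirty GFNs
must be collected in sequence.
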